84

Tags can have multiple attributes. The order in which attributes appear in the code does not matter. For example:

<a href="#" title="#">
<a title="#" href="#">

How can I "normalize" the HTML in Javascript, so the order of the attributes is always the same? I don't care which order is chosen, as long as it is always the same.

UPDATE: my original goal was to make it easier to diff (in JavaScript) two HTML pages with slight differences. Because users could use different software to edit the code, the order of the attributes could change. This makes the diff too verbose.

ANSWER: Well, first, thanks for all the answers. And YES, it is possible. Here is how I've managed to do it. This is a proof of concept; it can certainly be optimized:

// Comparator: order attributes alphabetically by name.
function sort_attributes(a, b) {
  if (a.name == b.name) {
    return 0;
  }

  return (a.name < b.name) ? -1 : 1;
}

$("#original").find('*').each(function() {
  if (this.attributes.length > 1) {
    var attributes = this.attributes;
    var list = [];
    var i;

    // Copy the live attribute collection into a plain array before changing it.
    for (i = 0; i < attributes.length; i++) {
      list.push(attributes[i]);
    }

    list.sort(sort_attributes);

    // removeAttribute() takes only the attribute name.
    for (i = 0; i < list.length; i++) {
      this.removeAttribute(list[i].name);
    }

    // Re-add the attributes in sorted order.
    for (i = 0; i < list.length; i++) {
      this.setAttribute(list[i].name, list[i].value);
    }
  }
});

Same thing for the second element of the diff, $('#different'). Now $('#original').html() and $('#different').html() show HTML code with attributes in the same order.

kaissun
Julien
  • 59
    What is the need for this? – rahul Oct 20 '10 at 04:21
  • 7
    What exactly do you mean? Once the HTML is parsed, there is no such thing as "order" of attributes. – casablanca Oct 20 '10 at 04:22
  • 40
    @rahul: actually there is a pretty interesting need for this: it can greatly improve the gzip compression of your pages. – haylem Oct 20 '10 at 04:32
  • 11
    ah, in Javascript... so much for compression. No idea what the need is then. – haylem Oct 20 '10 at 04:33
  • 13
    @Julien: By the time your JavaScript code runs, the page has already been sent to the client. I don't see how it can help in compression then. – casablanca Oct 20 '10 at 04:36
  • 3
    Could it be that the question is : is there a way to iterate over the attributes of a DOM object, in javascript, in a predictable order ? – phtrivier Oct 20 '10 at 09:21
  • 2
    I suppose order could be important when comparing text output? I.e. helps avoid false positives purely because order's changed? Ah the joys of speculation :-) – Brian Oct 20 '10 at 10:17
  • 5
    Was this question up-voted because of the funny answers, or are there really a bunch of users who want to know how to do the impossible? – mikerobi Oct 20 '10 at 14:51
  • 2
    @phtrivier, @haylem: Server-side JavaScript certainly exists and is becoming more popular by the day. Then again, if that was the case here it probably would have been mentioned. – Matt Kantor Oct 20 '10 at 15:28
  • 1
    Why does this question have so many up-votes? It's like the mob just showed up. – orokusaki Oct 20 '10 at 15:56
  • @haylem - how exactly are you going to fzip a page after the browser has already downloaded it? If the answer is, "He's talking about using JavaScript when the coder writes the code to save it," my reply would be, "Do this on the server, not in JavaScript." – orokusaki Oct 20 '10 at 15:57
  • @orokusaki: if you had read the comments you'd have noticed I had corrected myself. – haylem Oct 20 '10 at 16:04
  • 22
    There's actually a valid use for trying to do what the OP asks: using a WYSIWYG editor to drive a wiki. The project I'm working on does exactly that, and the editor would reverse the order of attributes every time you edited the wiki, resulting in unnecessary diffs. I ended up alphabetically sorting attributes in the submitted HTML on the backend before saving to avoid diffs; I could have just as easily done that sort in JavaScript before submitting. – Frank Farmer Oct 20 '10 at 16:57
  • Couldn't you run javascript server side to do this to improve gzip compression before the page is sent? I thought gzip compression over http was common, but I'm not a web developer. – Joseph Garvin Oct 20 '10 at 18:43
  • 5
    I really do not understand how this question is less than a day old and is approaching 50K views. – JD Isaacks Oct 20 '10 at 20:22
  • @Julien your solution only works so long as a browser returns attributes in the order they were added. There's no specification that insists they do that. – Pointy Oct 21 '10 at 09:15
  • 1
    This process is called [canonicalization](http://en.wikipedia.org/wiki/Canonicalization). – Jordão Oct 21 '10 at 16:29
  • @haylem - my apologies. The funny thing is how many people up-voted the first comment. The even funnier part is how I called gzip "fzip" in my comment... I guess I can't bash mistakes while making my own mistakes :) – orokusaki Oct 21 '10 at 17:10
  • @orokusaki: no problem. The second comment wasn't addressed to you, so you just missed it. I guess people up-voted my initial comment because they made the same mistake as me and didn't notice we were talking about doing it in JS on the client side, so they thought it made sense (see the Recommendations part in this section: http://code.google.com/speed/page-speed/docs/payload.html#GzipCompression). – haylem Oct 21 '10 at 20:15
  • At this time there is a fun tool for code readability here: https://unminify.com and another one here: https://htmlformatter.com – Jeff Clayton Dec 23 '19 at 19:31

8 Answers

68

JavaScript doesn't actually see a web page in the form of text-based HTML, but rather as a tree structure known as the DOM, or Document Object Model. The order of HTML element attributes in the DOM is not defined (in fact, as Svend comments, they're not even part of the DOM), so the idea of sorting them at the point where JavaScript runs is irrelevant.

I can only guess what you're trying to achieve. If you're trying to do this to improve JavaScript/page performance, HTML renderers presumably already put a lot of effort into optimising attribute access, so there's little to be gained there.

If you're trying to order attributes to make gzip compression of pages more effective as they're sent over the wire, understand that client-side JavaScript runs after that point in time. You may want to look at things that run server-side instead, though it's probably more trouble than it's worth.

Tung Nguyen
  • 8
    JavaScript can run server-side. – Matt Kantor Oct 20 '10 at 15:30
  • Attributes are not considered part of the document tree (which uses ordering naturally). So while Attr inherits the Node interface, DOM Core 2 specifies these fields to be null for attributes http://www.w3.org/TR/DOM-Level-2-Core/core.html#ID-637646024 – Svend Oct 20 '10 at 17:37
35

Take the HTML and parse into a DOM structure. Then take the DOM structure, and write it back out to HTML. While writing, sort the attributes using any stable sort. Your HTML will now be normalized with regard to attributes.

This is a general way to normalize things. (parse non-normalized data, then write it back out in normalized form).
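A minimal sketch of the "write it back out" half, under loud assumptions: the node shape used here (`tag`, `attrs`, `children`) is a hypothetical plain-object tree invented for illustration, not the browser DOM, and the `serialize`, `escapeAttr` and `escapeText` names are made up too. Void elements such as `<br>` are not handled. The only point is that sorting attribute names at write time gives the same text regardless of the original order.

```javascript
// Hypothetical node shape (illustration only, not the browser DOM):
//   { tag: "a", attrs: { href: "#" }, children: [ nodes or strings ] }

function escapeAttr(value) {
  return String(value).replace(/&/g, "&amp;").replace(/"/g, "&quot;");
}

function escapeText(text) {
  return String(text).replace(/&/g, "&amp;").replace(/</g, "&lt;");
}

function serialize(node) {
  if (typeof node === "string") {
    return escapeText(node);
  }

  // Sort attribute names so the output order never depends on input order.
  var names = Object.keys(node.attrs || {}).sort();
  var attrText = names.map(function (name) {
    return " " + name + '="' + escapeAttr(node.attrs[name]) + '"';
  }).join("");

  var inner = (node.children || []).map(serialize).join("");
  return "<" + node.tag + attrText + ">" + inner + "</" + node.tag + ">";
}

var first = serialize({ tag: "a", attrs: { href: "#", title: "#" }, children: ["link"] });
var second = serialize({ tag: "a", attrs: { title: "#", href: "#" }, children: ["link"] });
console.log(first === second); // true
```

In a browser the parsing half would come from the DOM itself; the tree above only stands in for it so the sketch can run anywhere.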

I'm not sure why you'd want to Normalize HTML, but there you have it. Data is data. ;-)

Kim Bruning
12

This is a proof of concept; it can certainly be optimized:

// Comparator: order attributes alphabetically by name.
function sort_attributes(a, b) {
  if (a.name == b.name) {
    return 0;
  }

  return (a.name < b.name) ? -1 : 1;
}

$("#original").find('*').each(function() {
  if (this.attributes.length > 1) {
    var attributes = this.attributes;
    var list = [];
    var i;

    // Copy the live attribute collection into a plain array before changing it.
    for (i = 0; i < attributes.length; i++) {
      list.push(attributes[i]);
    }

    list.sort(sort_attributes);

    // removeAttribute() takes only the attribute name.
    for (i = 0; i < list.length; i++) {
      this.removeAttribute(list[i].name);
    }

    // Re-add the attributes in sorted order.
    for (i = 0; i < list.length; i++) {
      this.setAttribute(list[i].name, list[i].value);
    }
  }
});

Same thing for the second element of the diff, $('#different'). Now $('#original').html() and $('#different').html() show HTML code with attributes in the same order.
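For anyone who wants the same idea without jQuery, here is a hedged sketch as a standalone function. The name `sortElementAttributes` is made up for this example. It only assumes the object it receives has `attributes`, `removeAttribute(name)` and `setAttribute(name, value)`, which DOM elements do. As a comment on the question points out, the result still depends on the browser writing attributes out in the order they were added.

```javascript
function sortElementAttributes(element) {
  // Copy name/value pairs first, because element.attributes is a live collection.
  var pairs = [];
  for (var i = 0; i < element.attributes.length; i++) {
    pairs.push({
      name: element.attributes[i].name,
      value: element.attributes[i].value
    });
  }

  pairs.sort(function (a, b) {
    return a.name < b.name ? -1 : (a.name > b.name ? 1 : 0);
  });

  // Remove everything, then add it back in sorted order.
  pairs.forEach(function (pair) {
    element.removeAttribute(pair.name);
  });
  pairs.forEach(function (pair) {
    element.setAttribute(pair.name, pair.value);
  });

  return element;
}

// Browser usage (assumes the #original container from the question):
//   var nodes = document.querySelectorAll("#original *");
//   Array.prototype.forEach.call(nodes, sortElementAttributes);
```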

Julien
  • I think it would be better to generate your HTML contents as XML and then render them using XSLT. You will surely get nicer output. – Nasaralla Oct 06 '11 at 14:41
8

You can try opening the HTML tab in Firebug; the attributes are always in the same order.

tsurahman
  • 4
    This isn't really helpful on its own. That's because it is re-creating the HTML from the DOM, and however this happens has a particular attribute iteration order (or Firebug sorts them manually). Julien could take advantage of this and use the same method to write out HTML. – Matt Kantor Oct 20 '10 at 15:32
5

Actually, I can think of a few good reasons. One would be comparison for identity matching and for use with 'diff' type tools where it is quite annoying that semantically equivalent lines can be marked as "different".

The real question is "Why in Javascript"?

This question "smells" of "I have a problem and I think I have an answer...but I have a problem with my answer, too."

If the OP would explain why they want to do this, their chances of getting a good answer would go up dramatically.
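To make the diff case above concrete, here is a rough sketch that rewrites a single start tag as text so its attributes come out sorted by name. Everything in it is an assumption for illustration: the `normalizeStartTag` name is invented, and the regular expression only understands attributes written as name="value" with double quotes. Unquoted or valueless attributes and a trailing "/" are dropped, so a real tool should use a proper HTML parser.

```javascript
function normalizeStartTag(tag) {
  var nameMatch = /^<([A-Za-z][A-Za-z0-9-]*)/.exec(tag);
  if (!nameMatch) {
    return tag; // not a start tag this sketch understands; leave it alone
  }

  // Collect name="value" pairs (double-quoted values only).
  var attrPattern = /([^\s"'<>\/=]+)="([^"]*)"/g;
  var attrs = [];
  var match;
  while ((match = attrPattern.exec(tag)) !== null) {
    attrs.push({ name: match[1], value: match[2] });
  }

  attrs.sort(function (a, b) {
    return a.name < b.name ? -1 : (a.name > b.name ? 1 : 0);
  });

  var attrText = attrs.map(function (attr) {
    return " " + attr.name + '="' + attr.value + '"';
  }).join("");

  return "<" + nameMatch[1] + attrText + ">";
}

var left = '<a href="#" title="#">';
var right = '<a title="#" href="#">';
console.log(left === right);                                       // false
console.log(normalizeStartTag(left) === normalizeStartTag(right)); // true
```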

Snowhare
2

The question "What is the need for this?" Answer: It makes the code more readable and easier to understand.

Why most UI sucks... Many programmers fail to understand the need for simplifying the user's job. In this case, the user's job is reading and understanding the code. One reason to order the attributes is for the human who has to debug and maintain the code. An ordered list, which the programmer becomes familiar with, makes their job easier. They can more quickly find attributes, realize which attributes are missing, and change attribute values.

signedbit
  • Methinks you have not thought about the question for long enough; even a working solution to the question would not address what you say here, true though it may be. – issa marie tseng Oct 20 '10 at 19:24
  • Why do you suppose that the OP would want to do this with Javascript? It's *possible* that a server-side (build time?) Javascript solution was in mind, but it's unlikely that somebody experienced enough to do that would have failed to mention it in a Stackoverflow post. It's also possible that the OP is implementing an in-browser HTML editor, but that also seems doubtful. – Pointy Oct 20 '10 at 22:27
0

This only matters when someone is reading the source, so for me it's semantic attributes first, less semantic ones next...

There are exceptions, of course. If you have, for example, consecutive <li> elements that all share one attribute while only some have others, you may want to ensure the shared ones come first, followed by the individual ones, e.g.:

<li a="x">A</li>
<li a="y" b="t">B</li>
<li a="z">C</li>

(Even if the "b" attribute is more semantically useful than "a")

You get the idea.

Ali
0

It is actually possible, I think, if the HTML contents are passed as XML and rendered through XSLT. Your original content in XML can then be in whatever order you want.

Nasaralla