I wrote a script to generate XML, YAML, and REBOL files at different nesting depths and compare their sizes. The structure I used is pretty simple:
<?xml version="1.0"?>
<node>
<node>
<node>
...
<node>value</node>
...
</node>
</node>
</node>
Apart from being inflexible, indentation (YAML block structure) is bad for a pretty obvious reason: every extra level of nesting adds more leading whitespace to each line beneath it.
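The following is a minimal sketch of this kind of measurement, not the original script. It assumes a single chain of nested `node` elements with one leaf value, XML written with no whitespace and no XML declaration, and YAML block mappings indented two spaces per level. The function names are made up for illustration.

```python
def nested_xml(depth: int, value: str = "value") -> str:
    """Nest <node> elements `depth` levels deep, with no whitespace at all."""
    return "<node>" * depth + value + "</node>" * depth


def nested_yaml(depth: int, value: str = "value", indent: int = 2) -> str:
    """Same chain as YAML block mappings; each level is indented further."""
    lines = []
    for level in range(depth - 1):
        lines.append(" " * (indent * level) + "node:")
    lines.append(" " * (indent * (depth - 1)) + "node: " + value)
    return "\n".join(lines) + "\n"


def size_table(depths):
    """Return (depth, xml_bytes, yaml_bytes) for each requested depth."""
    return [
        (d, len(nested_xml(d).encode("utf-8")), len(nested_yaml(d).encode("utf-8")))
        for d in depths
    ]


if __name__ == "__main__":
    for depth, xml_size, yaml_size in size_table([1, 5, 10, 20, 50]):
        print(f"depth={depth:3d}  xml={xml_size:5d}  yaml={yaml_size:5d}")
```

Under these assumptions the XML grows linearly with depth, while the YAML grows quadratically because of the indentation. YAML is smaller for shallow nesting and the two cross over at around depth 8. Real documents with many siblings per level will behave differently.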
Of course, to get a better idea, I’d have to analyze various sample structures. Nonetheless, what I’m trying to say is that there are definitely better ways to represent semantic data. File size becomes important when serving feeds (RSS et al.), where bandwidth is a major concern.
Consider the following RSS 2.0 file:
<?xml version="1.0"?>
<rss version="2.0">
<channel>
<title>My Blog</title>
<link>http://myblog.com/blog</link>
<description>My life</description>
<language>en-us</language>
<pubDate>Tue, 01 Jan 2004 04:00:00 GMT</pubDate>
<lastBuildDate>Tue, 01 Jan 2004 21:30:00 GMT</lastBuildDate>
<docs>http://blogs.law.harvard.edu/tech/rss</docs>
<generator>My Generator</generator>
<managingEditor>me@myblog.com</managingEditor>
<webMaster>webmaster@myblog.com</webMaster>
<item>
<title>Hotel Foo</title>
<link>http://myblog.com/archives/2005/hotel-foo.html</link>
<description>I love hotel foo</description>
<pubDate>Tue, 01 Feb 2005 09:50:00 GMT</pubDate>
<guid>http://myblog.com/archives/2005/hotel-foo.html</guid>
</item>
</channel>
</rss>
Its YAML equivalent would be:
channel:
  title: My Blog
  link: http://myblog.com/blog
  description: My life
  language: en-us
  pubDate: Tue, 01 Jan 2004 04:00:00 GMT
  lastBuildDate: Tue, 01 Jan 2004 21:30:00 GMT
  generator: My Generator
  item:
    - title: Hotel Foo
      link: http://myblog.com/archives/2005/hotel-foo.html
      description: I love Hotel Foo
      pubDate: Tue, 01 Feb 2005 09:50:00 GMT
      guid: http://myblog.com/archives/2005/hotel-foo.html
In its least verbose form (with unnecessary whitespace removed), the XML format takes a little over 70% more space than its YAML equivalent. That’s a big difference.
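For anyone who wants to repeat this kind of comparison, here is a rough sketch under stated assumptions. The helper names `minify_xml` and `overhead_percent` are hypothetical, the regex only strips whitespace sitting between tags (it is not a real XML minifier and ignores CDATA and mixed content), and the sample is a two-field fragment of the item above, not the full feed.

```python
import re


def minify_xml(text: str) -> str:
    """Drop whitespace that sits only between tags (a rough, regex-based pass)."""
    return re.sub(r">\s+<", "><", text.strip())


def overhead_percent(larger: str, smaller: str) -> float:
    """How much bigger `larger` is than `smaller`, as a percentage of `smaller`."""
    big = len(larger.encode("utf-8"))
    small = len(smaller.encode("utf-8"))
    return (big - small) / small * 100.0


# Illustrative fragment only; the figures in the post refer to the whole feed.
xml_sample = """
<item>
  <title>Hotel Foo</title>
  <guid>http://myblog.com/archives/2005/hotel-foo.html</guid>
</item>
"""
yaml_sample = (
    "item:\n"
    "  title: Hotel Foo\n"
    "  guid: http://myblog.com/archives/2005/hotel-foo.html\n"
)

if __name__ == "__main__":
    pct = overhead_percent(minify_xml(xml_sample), yaml_sample)
    print(f"minified XML is {pct:.1f}% larger than the YAML")
```

The percentage depends heavily on how long the values are relative to the tag names, so a fragment like this will not reproduce the figure for the full feed.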
I have started using YAML for certain work-related activities. If I find time (which is extremely difficult), I’ll try to dig into this some more.
REBOL link
(Anonymous)
2005-02-01 06:56 pm (UTC)
BTW, looks like XML wins when depth is big, right?
JJ
Re: REBOL link
2005-02-01 07:00 pm (UTC)
Of course, this is _not_ a thorough analysis—I have considered only one structure.
Re: REBOL link
(Anonymous)
2005-02-01 08:33 pm (UTC)
Regardless. As some other poster mentioned, if you are looking for small downloads, just use gzip.
Re: REBOL link
2005-02-02 06:26 am (UTC)
2005-02-01 08:15 pm (UTC)
Your RSS feed should be gzip compressed.
Lose Weight, Save Money with Compression!
Leknor.com - Code - gziped?
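To see what gzip does to a feed's size, a small sketch using Python's standard `gzip` module follows. The `gzip_sizes` helper and the repeated sample feed are assumptions made for illustration. In practice the web server would normally apply the compression via `Content-Encoding: gzip`.

```python
import gzip


def gzip_sizes(payload: str) -> tuple:
    """Return (raw_bytes, gzipped_bytes) for a text payload."""
    raw = payload.encode("utf-8")
    return len(raw), len(gzip.compress(raw))


# A made-up feed: the same item repeated, as a stand-in for a real RSS file.
item = (
    "<item><title>Hotel Foo</title>"
    "<link>http://myblog.com/archives/2005/hotel-foo.html</link></item>"
)
feed = '<rss version="2.0"><channel>' + item * 20 + "</channel></rss>"

if __name__ == "__main__":
    raw_size, packed_size = gzip_sizes(feed)
    print(f"raw={raw_size} bytes, gzipped={packed_size} bytes")
```

Repeated tag names compress very well, which is why markup verbosity matters much less once the feed is served compressed.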
2005-02-02 03:09 pm (UTC)
illegal yaml
(Anonymous)
2005-02-01 08:42 pm (UTC)