Lately I had an idea about lazily parsing JSON. Parsing JSON means building a data structure in the host language (e.g. Ruby). For larger JSON documents this becomes expensive, especially if you’re only using a few of the values. For lazy parsing to be efficient, the document has to specify forward skips, as in the following example:
{/*20*/
a: "test",
b: "abc",
}
The 20 here means that the closing "}" is 20 bytes further on, so the JSON parser can write down the current location, skip 20 bytes, and continue parsing after the object. Instead of a regular Hash it would create a special Hash object, which lazily parses the inner JSON upon first access.
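To make the skip idea concrete, here is a minimal Ruby sketch. It rests on a few assumptions of mine: the hint counts the bytes between the end of the comment and the closing "}", keys are quoted so that Ruby's standard json library can parse the body, and only one level of hinted objects is handled. The names LazyObject and read_hinted_object are made up for illustration.

```ruby
require "json"
require "strscan"

# Hypothetical lazy wrapper: remembers where the object's body lives in the
# source text and parses it only on first access.
class LazyObject
  def initialize(source, body_start, body_length)
    @source = source
    @body_start = body_start
    @body_length = body_length
    @parsed = nil
  end

  # True once the body has actually been parsed.
  def parsed?
    !@parsed.nil?
  end

  def [](key)
    @parsed ||= begin
      body = @source.byteslice(@body_start, @body_length)
      JSON.parse("{" + body.force_encoding(Encoding::UTF_8) + "}")
    end
    @parsed[key]
  end
end

# Reads a hinted object that starts at byte offset pos. Returns the lazy
# wrapper and the byte offset just past the closing brace, without looking
# at the bytes in between.
def read_hinted_object(source, pos)
  scanner = StringScanner.new(source)
  scanner.pos = pos
  scanner.scan(/\{\/\*(\d+)\*\//) or
    raise ArgumentError, "no skip hint at byte #{pos}"
  body_length = scanner[1].to_i
  body_start = scanner.pos
  close = body_start + body_length
  unless source.getbyte(close) == "}".ord
    raise ArgumentError, "skip hint does not land on '}'"
  end
  [LazyObject.new(source, body_start, body_length), close + 1]
end

body = %("a": "test", "b": "abc")
doc = "{/*#{body.bytesize}*/#{body}}"

obj, next_pos = read_hinted_object(doc, 0)
puts obj.parsed?   # nothing inside the braces has been parsed yet
puts obj["a"]      # the first access triggers the real parse
puts next_pos == doc.bytesize
```

The parser only pays for the objects that are actually touched; everything else costs one integer read and one jump.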
Of course this requires that the JSON document is kept available (either in memory or on disk). Even for a large JSON document, keeping the raw text in memory usually takes far less space than all the Ruby values built from it.
Another idea I had was that of a streaming JSON parser, similar to what exists for XML (SAX or expat). This would allow for very space-efficient extraction of values out of a JSON document of any size. Well, maybe I’ll rewrite my C++ JSON parser into a streaming parser one day.
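Roughly what I have in mind for the streaming variant, sketched in Ruby rather than C++. This is not a validating parser: it tolerates trailing commas, does not combine \u surrogate pairs, and the class name StreamingJSON and the event names are invented here. It reads character by character from an IO and hands events to a block, so memory use depends on nesting depth and the largest single value, not on document size.

```ruby
require "stringio"

# Hypothetical SAX-style reader: yields (event, payload) pairs while reading
# from an IO and never builds the whole document in memory.
class StreamingJSON
  ESCAPES = { "n" => "\n", "t" => "\t", "r" => "\r", "b" => "\b", "f" => "\f" }.freeze

  def initialize(io)
    @io = io
  end

  def each_event(&emit)
    value(&emit)
  end

  private

  # Next non-whitespace character.
  def next_char
    c = @io.getc
    c = @io.getc while c && c.match?(/\s/)
    c or raise EOFError, "unexpected end of JSON"
  end

  def value(&emit)
    c = next_char
    case c
    when "{" then object(&emit)
    when "[" then array(&emit)
    when '"' then emit.call(:value, string)
    else emit.call(:value, scalar(c))
    end
  end

  def object(&emit)
    emit.call(:start_object, nil)
    c = next_char
    until c == "}"
      raise ArgumentError, "expected a key" unless c == '"'
      emit.call(:key, string)
      raise ArgumentError, "expected ':'" unless next_char == ":"
      value(&emit)
      c = next_char
      next if c == "}"
      raise ArgumentError, "expected ',' or '}'" unless c == ","
      c = next_char
    end
    emit.call(:end_object, nil)
  end

  def array(&emit)
    emit.call(:start_array, nil)
    c = next_char
    until c == "]"
      @io.ungetc(c) # the character belongs to the element, push it back
      value(&emit)
      c = next_char
      next if c == "]"
      raise ArgumentError, "expected ',' or ']'" unless c == ","
      c = next_char
    end
    emit.call(:end_array, nil)
  end

  # Reads the rest of a string; the opening quote is already consumed.
  def string
    out = +""
    while (c = @io.getc)
      return out if c == '"'
      if c == "\\"
        e = @io.getc
        c = e == "u" ? [@io.read(4).hex].pack("U") : ESCAPES.fetch(e, e)
      end
      out << c
    end
    raise EOFError, "unterminated string"
  end

  # Numbers, true, false and null: read up to the next delimiter.
  def scalar(first)
    token = first.dup
    while (c = @io.getc)
      if c.match?(/[\s,\]}]/)
        @io.ungetc(c)
        break
      end
      token << c
    end
    case token
    when "true" then true
    when "false" then false
    when "null" then nil
    when /\A-?\d+\z/ then token.to_i
    else Float(token)
    end
  end
end

# Pull a single value out of the stream without keeping anything else.
io = StringIO.new('{"id": 7, "tags": ["a", "b"], "name": "Ann"}')
last_key = nil
name = nil
StreamingJSON.new(io).each_event do |event, payload|
  last_key = payload if event == :key
  name = payload if event == :value && last_key == "name"
end
puts name
```

The same class works on a File or a socket instead of a StringIO, since it only needs getc, ungetc and read.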