So, Let’s Organise Our Web Scrape? HEY JSON.

Sooo, if you read this post, you’ll know that I scraped some subway information off the Internets. Now that I have the data, I need to organise it all pretty like. How do I do this? JSON!

Why JSON, you ask? Decent question. I chose JSON because I want to use GeoJSON for a project. Because I don’t know enough about GeoJSON yet, I’m just gonna store the data as a JSON object for now and then convert it later when I’m more comfortable in these JSON waterfalls.

SCI, SLOW YOUR HORSES DOWN AND EXPLAIN WHAT THE HECK JSON IS IN THE FIRST PLACE. Oof, fiiiineee. So touchy. Let’s go to the Internet’s HAL 9000 for the answer:

JSON, or JavaScript Object Notation, is a text-based open standard designed for human-readable data interchange.

In layman’s terms, it’s a way of writing objects down as plain text. The syntax is borrowed from JavaScript, but pretty much any language can read and write it. Yippee. I’m too lazy to go through an example explaining the structure, so Google if you’re BOVVERED.

Now. From my previous post, I have a list of station objects. Each object contains various buckets of data, but I only need three parts: station name, latitude, and longitude. So that’s exactly what my code does: extract the desired data, store it as a JSON object, and add that JSON object to a JSON array.

I’m sure you want to see the code that does this. And I really want to show it to you. But I just realised that if I update the gist from the previous post it might make the previous post nonsensical. ALL THIS because WordPress doesn’t provide code formatting. Sigh. Here’s the new JSON-specific code in all its ugly format glory…NEVERMIND. I have updated the gist and it turns out I can link to a specific revision. THANK YOU GITHUB. Go HERE for the code. Let me break down the new parts:

  • First of all, I used the JSON.simple library to do all of this. Go HERE to download the jar. I also used mkyong’s tutorial to figure out what goes where.
  • OKAY. So. I created a JSONArray object called stations. This is where I’m going to store each station object.
  • NEXT. Every time I looped through a row in the table with the station data, I created a new JSONObject object and used it to store the station name, latitude, and longitude. I then added this station object to the stations array.
  • Lastly, I created a FileWriter object and a File object and wrote my stations array to the file by using the toJSONString() method from the JSON.simple library. This converts your JSONObject or JSONArray into a JSON-formatted string.
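
If you don’t have the JSON.simple jar handy, here is a rough sketch of the same three steps using nothing but the standard library. Fair warning: this is not the code from the gist. A LinkedHashMap stands in for JSONObject, a List stands in for JSONArray, and a hand-rolled toJsonString helper stands in for toJSONString(). The class name, the helper names, the keys (name, latitude, longitude), the sample rows, and the stations.json file name are all made up for illustration.

```java
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: plain collections standing in for JSON.simple's classes.
final class StationSketch {

    // Stand-in for one JSONObject: keeps only the three fields the post needs.
    public static Map<String, Object> station(String name, double latitude, double longitude) {
        Map<String, Object> s = new LinkedHashMap<>(); // preserves insertion order
        s.put("name", name);
        s.put("latitude", latitude);
        s.put("longitude", longitude);
        return s;
    }

    // Minimal escaping: only backslash and double quote are handled here.
    // A real library also escapes control characters.
    public static String quote(String text) {
        return "\"" + text.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Stand-in for toJSONString(): serialises the list as a compact JSON array.
    public static String toJsonString(List<Map<String, Object>> stations) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < stations.size(); i++) {
            if (i > 0) sb.append(",");
            sb.append("{");
            boolean first = true;
            for (Map.Entry<String, Object> e : stations.get(i).entrySet()) {
                if (!first) sb.append(",");
                first = false;
                Object value = e.getValue();
                sb.append(quote(e.getKey())).append(":");
                // Numbers are written bare, everything else as a quoted string.
                sb.append(value instanceof Number ? value.toString() : quote(String.valueOf(value)));
            }
            sb.append("}");
        }
        return sb.append("]").toString();
    }

    public static void main(String[] args) throws IOException {
        // Made-up rows standing in for the scraped table: name, latitude, longitude.
        String[][] rows = {
            {"Example Station A", "43.5", "-79.25"},
            {"Example Station B", "43.75", "-79.5"},
        };

        // Step 1: the array that collects every station.
        List<Map<String, Object>> stations = new ArrayList<>();

        // Step 2: one object per table row, added to the array.
        for (String[] row : rows) {
            stations.add(station(row[0], Double.parseDouble(row[1]), Double.parseDouble(row[2])));
        }

        // Step 3: write the serialised array to a file.
        try (FileWriter writer = new FileWriter("stations.json")) {
            writer.write(toJsonString(stations));
        }
    }
}
```

With JSON.simple on the classpath, the map becomes a JSONObject, the list becomes a JSONArray, and the helper goes away in favour of toJSONString().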


I really hate the template I’m using.

