I do miss Pokemon. ANYWAYS, onto the meat of the post. Last time, I spoke about organising data into JSON objects that I could work with. I also told you I was only doing that so I could understand JSON before moving onto GeoJSON. Well. Guess what. GeoJSON IS JUST A JSON OBJECT WITH A PARTICULAR STRUCTURE. Talk about an anticlimax.

HERE is the final code for scraping HTML and storing it as JSON objects. And HERE is the new code for doing the same thing but storing it as a GeoJSON object. For a detailed breakdown of what a GeoJSON object looks like, go HERE. If you can’t be bothered to read through it and figure it out, usually I would tell you to leave me alone. But I’m in a giving mood so I’m gonna explain it.

  • A GeoJSON object is ONE JSON object.
  • It is made up of TWO properties:
    • A “features” property
    • A “type” property
  • The “type” property is always set to “FeatureCollection” (side note: JSON strings are case-sensitive, so the capitalisation has to match exactly)
  • The “features” property is always an array containing objects that represent your geographical data
  • Each object representing your geographical data contains three properties:
    • A “type” property
    • A “geometry” property
    • A “properties” property
  • The “type” property is always set to “Feature” as that is what each geographical object is called
  • The “geometry” property always contains an object that represents your exact geographical data and has two properties:
    • A “type” property that names the kind of geometry (Point, MultiPoint, LineString, Polygon, etc.)
    • A “coordinates” property that contains the numbers representing your geographical data. For a Point it is a single array of numbers, longitude first and then latitude; for the other geometry types it is an array of arrays of numbers, nested deeper still for things like Polygon. Easy on the brain and tongue, right?
  • The “properties” property is an object that holds any extra properties you want this specific feature to have, à la typical JSON fashion

DAS IT. Again, look at the spec page I linked to for more detailed information. If you look at my code, all I’m doing is translating the above into computer speak. It feels awkward at first because you’re creating an object that represents a property and an array, and the array represents objects that represent a property, an object that represents more properties and arrays, and an object that represents even more properties.
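To make the nesting a bit less dizzying, here is a minimal sketch of the structure above in plain Java. It is not the code from the gist: it skips JSON.simple and glues the strings together by hand so it runs without any jar, and the class name, method names, station name, and coordinates are all made up for illustration. It only escapes quotes and backslashes, so treat it as a shape demo rather than a real serialiser.

```java
import java.util.List;

// Hypothetical sketch: builds GeoJSON text by hand, no JSON library involved.
final class GeoJsonSketch {

    // Escapes only backslashes and double quotes; control characters are not handled.
    static String escape(String text) {
        return text.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // One Feature: a "type", a "geometry" object, and a "properties" object.
    static String toFeature(String name, double latitude, double longitude) {
        // A GeoJSON Point lists longitude first, then latitude.
        String geometry = "{\"type\":\"Point\",\"coordinates\":["
                + longitude + "," + latitude + "]}";
        String properties = "{\"name\":\"" + escape(name) + "\"}";
        return "{\"type\":\"Feature\",\"geometry\":" + geometry
                + ",\"properties\":" + properties + "}";
    }

    // The outer object: a "type" of FeatureCollection plus the "features" array.
    static String toFeatureCollection(List<String> features) {
        return "{\"type\":\"FeatureCollection\",\"features\":["
                + String.join(",", features) + "]}";
    }

    public static void main(String[] args) {
        // Made-up station, purely for illustration.
        String feature = toFeature("Example Station", 40.75, -73.99);
        System.out.println(toFeatureCollection(List.of(feature)));
    }
}
```

Each call to toFeature gives you one entry for the “features” array, so in a scraper you would call it once per station and hand the whole list to toFeatureCollection at the end.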

Please go away.


Sooo, if you read this post, you’ll know that I scraped some subway information off the Internets. Now that I have the data, I need to organise it all pretty like. How do I do this? JSON!

Why JSON, you ask? Decent question. I chose JSON because I want to use GeoJSON for a project. Because I don’t know enough about GeoJSON yet, I’m just gonna store the data as a JSON object for now and then convert it later when I’m more comfortable in these JSON waterfalls.

SCI, SLOW YOUR HORSES DOWN AND EXPLAIN WHAT THE HECK JSON IS IN THE FIRST PLACE. Oof, fiiiineee. So touchy. Let’s go to the Internet’s HAL 9000 for the answer:

JSON, or JavaScript Object Notation, is a text-based open standard designed for human-readable data interchange.

In layman’s terms, it’s a plain-text way of writing down objects, borrowed from JavaScript’s object syntax. Yippee. I’m too lazy to go through an example explaining the structure, so Google it if you’re BOVVERED.

Now. From my previous post, I have a list of station objects. Each object contains various buckets of data, but I only need three parts: station name, latitude, and longitude. So that’s exactly what I did in my code: extract the desired data, store it as a JSON object, and add that JSON object to a JSON array. I’m sure you want to see the code that does this. And I really want to show it to you. But I just realised that if I update the gist from the previous post it might make the previous post nonsensical. ALL THIS because WordPress doesn’t provide code formatting. Sigh. Here’s the new JSON-specific code in all its ugly format glory…NEVERMIND. I have updated the gist and it turns out I can link to a specific revision. THANK YOU GITHUB. Go HERE for the code. Let me break down the new parts:

  • First of all, I used the JSON.simple library to do all of this. Go HERE to download the jar. I also used mkyong’s tutorial to figure out what goes where.
  • OKAY. So. I created a JSONArray object called stations. This is where I’m going to store each station object.
  • NEXT. Every time I looped through a row in the table with the station data, I created a new JSONObject object and used it to store the station name, latitude, and longitude. I then added this station object to the stations array.
  • Lastly, I created a FileWriter object and a File object and wrote my stations array to the file by using the toJSONString() method from the JSON.simple library. This converts your JSONObject or JSONArray into a string of JSON text.
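
The three steps above boil down to something like the sketch below. Big caveat: this is not the gist code and it does not use JSON.simple; it builds the JSON text by hand with plain Java so it runs without the jar. The class name, method names, key names (“name”, “latitude”, “longitude”), and the sample station are all assumptions made up for illustration.

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the "object per station, array of stations, write to file" flow.
final class StationsJsonSketch {

    // One station as a JSON object; only quotes and backslashes in the name are escaped.
    static String toStationJson(String name, double latitude, double longitude) {
        String safeName = name.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"name\":\"" + safeName + "\",\"latitude\":" + latitude
                + ",\"longitude\":" + longitude + "}";
    }

    // All stations as a JSON array.
    static String toStationsJson(List<String> stations) {
        return "[" + String.join(",", stations) + "]";
    }

    // Writes the JSON text to the given file, closing the writer when done.
    static void writeJson(String json, File target) throws IOException {
        try (FileWriter writer = new FileWriter(target)) {
            writer.write(json);
        }
    }

    public static void main(String[] args) throws IOException {
        List<String> stations = new ArrayList<>();
        // In the real scraper this would happen once per table row; the station here is made up.
        stations.add(toStationJson("Example Station", 40.75, -73.99));

        // A temp file keeps the sketch from clobbering anything in the working directory.
        File target = File.createTempFile("stations", ".json");
        writeJson(toStationsJson(stations), target);
        System.out.println("Wrote " + target.getAbsolutePath());
    }
}
```

With JSON.simple, the JSONObject, the JSONArray, and toJSONString() do the string gluing for you, which is the whole reason to grab the jar.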


I really hate the template I’m using.