NASA publishes a dataset of over 34,000 meteorite landings as recorded by The Meteoritical Society. Kevin shows you how Directus can be used to further explore, understand, and remix this dataset both in the Data Studio and via API.
Speaker 0: In this show, we give new life to open datasets with the help of Directus. Join me as we explore, analyze, and generate APIs to improve access to and democratize data. Before we get started, if you have or know of an interesting open dataset and want us to feature it, just reach out to our team and it may well feature in a future episode. Today's dataset was published by NASA and contains data on over 34,000 meteorites from The Meteoritical Society. It contains information about their classification, where and when they were observed, and whether they fell or were found.
Now just a quick note before we continue: after setting everything up, I learnt about the difference between a meteor and a meteorite. Basically, it's a meteor when it's in the air, and when it survives its journey down to Earth and lands, it becomes a meteorite. This is a dataset of meteorites, but for ease I'm going to refer to them as meteors throughout. I just want you to know that I know the difference now, and so do you. So these are meteorites, right?
But we'll refer to them as meteors from here. Now there are only a couple of very small processing steps that I made to this data before I imported it into Directus. Firstly, I converted the coordinates to GeoJSON objects, which allows us to use all of the mapping features inside of Directus. Secondly, there was one data point in here which was incorrectly entered.
It said that a meteor fell in, what was the year? 2101, and it was 2010. I googled it, and I fixed it in the dataset because it was ruining all of the graphs. So just note that there was a small error in the data that I fixed, and that I changed the format of the mapping and location data so it follows the GeoJSON standard. Now before we jump in and see what we can do with this data, just note that this specific dataset stops in 2013.
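The two preprocessing steps described above could be sketched roughly as follows. This is a minimal illustration, not the script used in the episode: the `lat`, `lon`, `year`, and `location` keys are assumed names, and the year fix simply matches on the value 2101 because the transcript does not name the affected record.

```python
from typing import Any, Optional


def to_geojson_point(lat: float, lon: float) -> dict[str, Any]:
    """Build a GeoJSON Point. GeoJSON positions are [longitude, latitude]."""
    return {"type": "Point", "coordinates": [lon, lat]}


def clean_row(row: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of one record with a GeoJSON location and the year fix.

    The 'lat', 'lon', 'year' and 'location' keys are hypothetical names.
    """
    cleaned = dict(row)
    lat: Optional[float] = cleaned.pop("lat", None)
    lon: Optional[float] = cleaned.pop("lon", None)
    # Records without coordinates keep an empty location.
    if lat is None or lon is None:
        cleaned["location"] = None
    else:
        cleaned["location"] = to_geojson_point(lat, lon)
    # The single mistyped year mentioned in the episode: 2101 should be 2010.
    if cleaned.get("year") == 2101:
        cleaned["year"] = 2010
    return cleaned
```

Note the coordinate order: GeoJSON puts longitude first, which is an easy thing to get backwards when the source data lists latitude first.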
And as we can see here on the web page, there is a new dataset on a new URL, but that URL doesn't resolve. So I just used this dataset. Just know that it stops in 2013, but I'm sure it could be updated if you were to follow along at home. This is the Directus Data Studio. It's a web application that lets you explore, analyze, and work with data in a database.
And as you can see, all 38,000 items from that dataset have been imported. Now straight away, we have access to everything Directus offers. So we can filter these items perhaps by the year that they were discovered or whether they fell or were found, stuff like that. We can step into a single item like this one here and see a bit more data around it. Here we can see the weight of it, or the mass of it rather, and when it was found.
And just to be clear, this is a date field, but the dataset only has the year. So the year is the only thing that actually matters here. We can also see whether it fell or was found, its class, and so on. And because we encoded the coordinates as a GeoJSON point, we know the exact point that this coordinate resolves to. So we can zoom out there and see that on a map.
Now this is a pretty standard way to look at data. It's a table, but Directus does also have other layouts, including a map. And as these data points are encoded as GeoJSON, as I mentioned, we now get to use the map layout, which I think is super interesting. So you can see a concentration here, perhaps, of the data that is stored in this dataset. So I just think that's really fascinating.
There's obviously a lot more we can do with this data inside of Directus, but what I wanna draw your attention to is the Insights module. So this is an Insights dashboard for this dataset that I prepared earlier, and it just allows us to analyse this data. So I've added a few panels here. Firstly, I've added this start year and end year input so we can change this. You know, we can change this, and all of these other fields are hooked into these two values.
To show you how that works, inside of this panel, we add a filter and we say the year must be greater than or equal to the start year and less than or equal to the end year. You can apply other filters, of course, in here manually, either hard coded or using dynamic values from text boxes. So in here, we also see this graph that I've put in showing all of the meteors in the dataset for these given years. Then there's the average size in the time frame. So we see that there was a kind of peak in size, I suppose, of meteors in 2004, and back up in 2013.
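The year-range filter described for the panels can be written as a filter object. The sketch below builds one using the `_and`, `_gte`, and `_lte` operators from the Directus filter syntax, and evaluates it locally on plain integers just to show the semantics. The field name `year` is an assumption, and because the real field is a date, an actual panel may need date values or a date function rather than bare integers.

```python
from typing import Any


def year_range_filter(start_year: int, end_year: int, field: str = "year") -> dict[str, Any]:
    """Filter object: start_year <= field <= end_year (field name is assumed)."""
    return {"_and": [{field: {"_gte": start_year}}, {field: {"_lte": end_year}}]}


def matches(item: dict[str, Any], flt: dict[str, Any]) -> bool:
    """Evaluate only the small filter shape built above, for illustration."""
    ops = {"_gte": lambda a, b: a >= b, "_lte": lambda a, b: a <= b}
    return all(
        ops[op](item[name], value)
        for condition in flt["_and"]
        for name, rule in condition.items()
        for op, value in rule.items()
    )
```

For example, `matches({"year": 2004}, year_range_filter(2000, 2013))` checks a single record against the same range the start year and end year inputs would supply.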
I don't know if that means anything, but you can discover it, I suppose. You can also look at whether they were found or fell. So we see there was just one meteor in this dataset that fell and was observed to have fallen, versus having later been found. Then there are the classes of those meteors, so we see in this time frame here, a majority of meteors were L6 class. And here is a list of meteors matching these criteria.
They're just alphabetized, so we can step into those and start to look in a little more detail. So as you saw, we can use the Directus Data Studio to explore and query data using the web application. But every Directus project also comes with a set of auto-generated APIs to do the same thing. And this means that developers can not only interact with that data using APIs, but can build applications on top of those APIs to increase its usage, remix it, and encourage citizen participation. Now these APIs can also be used to create or update data, and they can be locked down in a number of ways.
So they can be public, but they might also require registration or specific permissions. Now I've prepared some requests to show you so you can get an idea of how querying data works using the REST API. But, of course, there are API endpoints for updating or creating new data as well. Let's jump into Postman and see how it works. This endpoint will fetch all items from the meteors collection.
It will return an array of objects where each object represents one meteor. Now by default, this will return 100 items, and you can either paginate or change the size of the response or both. And realistically, you'll probably want to do both. And just a reminder that you can also lock down which fields are actually returned by Directus to a public user or indeed any user with any role. So here we see the ID, the name, the class, the location encoded as a GeoJSON object, and the mass in grams.
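A request like the one described above might be assembled as in this sketch. The base URL is a placeholder and `meteors` is the collection name used in the episode; `limit`, `page`, and `fields` are standard Directus query parameters. The fetch helper is included for completeness and assumes the usual response envelope with a top-level `data` key.

```python
import json
from typing import Any, Optional
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://example.directus.app"  # placeholder, not a real project


def items_url(collection: str, limit: int = 100, page: int = 1,
              fields: Optional[list[str]] = None) -> str:
    """Build an items URL with an explicit page size and page number."""
    params: dict[str, Any] = {"limit": limit, "page": page}
    if fields:
        # Directus accepts a comma-separated list of field names.
        params["fields"] = ",".join(fields)
    return f"{BASE_URL}/items/{collection}?{urlencode(params)}"


def fetch_items(url: str) -> list[dict[str, Any]]:
    """Fetch a URL and return the list under the 'data' key."""
    with urlopen(url) as response:
        return json.load(response)["data"]
```

Calling `items_url("meteors", limit=50, page=2)` gives the second page of 50 items; stepping `page` upward until an empty list comes back is one simple way to walk the whole collection.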
Now we can also tack on any number of Directus query parameters to change the data that comes back. So this example here has two query parameters. The first one, as shown here and here, will only return items with a mass that is greater than or equal to 1,000. That's grams. So that's one kilo.
So it will never return items that are less than 1,000 in this field. It will also sort the returned items starting with the biggest. So this is the biggest item on record, found in 1920, and the next biggest was found in 1818, and so on and so forth. So you can add any number of these filters, sorts, and query parameters, and you can add several filters at the same time as well. Now this final example isn't about getting data itself, but it's about using our aggregation and group by query parameters to start to analyze the data.
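The filter and sort request described above could look roughly like this. The base URL is a placeholder and the field name `mass` is an assumption; the bracketed `filter[field][operator]` form and the leading minus for descending sort follow the Directus query parameter syntax.

```python
from urllib.parse import urlencode

BASE_URL = "https://example.directus.app"  # placeholder, not a real project


def heavy_meteors_url(min_mass_grams: int = 1000, collection: str = "meteors") -> str:
    """Items with mass >= min_mass_grams, sorted from heaviest to lightest."""
    params = {
        "filter[mass][_gte]": min_mass_grams,  # 'mass' is an assumed field name
        "sort": "-mass",                       # leading '-' means descending
    }
    # Keep the square brackets readable instead of percent-encoding them.
    return f"{BASE_URL}/items/{collection}?{urlencode(params, safe='[]')}"
```

More filters can be added as further `filter[...]` keys in the same dictionary, which is how several conditions are combined in one request.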
And this is more or less how the Directus Insights module actually generates data in the panels. So here we're saying, get every item and group it by the year. So for every year in this dataset, we get to see how many meteors exist. So there's one in this year, one in this year, and so on. But down in, let's find an example here.
In 1768, there are two meteors in the dataset, and so on and so forth. These are really powerful. This is obviously quite a straightforward example, but you can use this in many, many ways. There is a whole set of query parameters you can use, not just aggregation and grouping. We saw filter.
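As a rough sketch of the grouping request above: the first function builds a URL with the `aggregate` and `groupBy` query parameters, and the second does the equivalent count locally so the result is easy to picture. The base URL is a placeholder, the field name `year` is an assumption, and the sample rows in the usage note are made up for illustration. The exact shape of the aggregated response is not assumed here.

```python
from collections import Counter
from typing import Any
from urllib.parse import urlencode

BASE_URL = "https://example.directus.app"  # placeholder, not a real project


def count_by_year_url(collection: str = "meteors", group_field: str = "year") -> str:
    """Ask the API to count items per distinct value of group_field."""
    params = {"aggregate[count]": "*", "groupBy[]": group_field}
    # Keep brackets and the asterisk readable in the query string.
    return f"{BASE_URL}/items/{collection}?{urlencode(params, safe='[]*')}"


def count_by_year(items: list[dict[str, Any]]) -> dict[Any, int]:
    """Local equivalent: number of items for each year value."""
    return dict(Counter(item["year"] for item in items))
```

With made-up rows such as `[{"year": 1768}, {"year": 1768}, {"year": 1790}]`, `count_by_year` reports two items for 1768 and one for 1790, which is the same kind of per-year tally the grouped request returns.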
We saw sort, but there's also search. There's stuff for pagination, aliases, functions you can use, especially around dates, and more. And there are also really complex filter operators that you can use. So we just used greater than or equal to, and this is, you know, one of quite a standard set of operators for any time you're querying data. But the ones that I find interesting are intersects and intersects bounding box.
These are only available when you have geospatial data. So you can set a box and say, is this point in the box or not, or only return items that are in the box. And so I'm sure you can start to imagine the applications you can build on top of this data along with the Directus APIs. So that's a little bit of an example of how to take data from NASA around meteorites that have fallen to Earth and bring them into Directus to explore them, to analyze them, and of course, to use the auto-generated APIs. Once again, if you have a dataset, or you know of a dataset, that you think would be interesting for this series, just get in touch, and maybe it will feature in a future episode.
But until then, bye for now.
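As a closing sketch of the bounding-box idea mentioned earlier: the code below builds a filter using the `_intersects_bbox` operator with a GeoJSON polygon for the box, and also checks a point against the box locally to show what "is this point in the box or not" means. The base URL is a placeholder, the field name `location` is an assumption, and the exact geometry the operator expects should be confirmed against the Directus documentation.

```python
import json
from typing import Any
from urllib.parse import urlencode

BASE_URL = "https://example.directus.app"  # placeholder, not a real project


def bbox_polygon(west: float, south: float, east: float, north: float) -> dict[str, Any]:
    """GeoJSON Polygon for a box; the ring is closed by repeating the first corner."""
    ring = [[west, south], [east, south], [east, north], [west, north], [west, south]]
    return {"type": "Polygon", "coordinates": [ring]}


def bbox_filter_url(west: float, south: float, east: float, north: float,
                    field: str = "location", collection: str = "meteors") -> str:
    """Items whose geometry in `field` (assumed name) intersects the box."""
    flt = {field: {"_intersects_bbox": bbox_polygon(west, south, east, north)}}
    # The filter is passed as a JSON string in the 'filter' query parameter.
    return f"{BASE_URL}/items/{collection}?{urlencode({'filter': json.dumps(flt)})}"


def point_in_bbox(point: dict[str, Any], west: float, south: float,
                  east: float, north: float) -> bool:
    """Local check for a GeoJSON Point ([longitude, latitude]) against the box."""
    lon, lat = point["coordinates"]
    return west <= lon <= east and south <= lat <= north
```

The local check ignores boxes that cross the antimeridian, which keeps the example short but would need handling in a real application.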