
Streaming Any Time Data Using JSON Published To GitHub

How can you stream any time data?

Discussions of streaming data tend to assume it is always about real-time data, meaning data that was just generated, created, updated, or produced by some other recent event. Real-time data definitely dominates the landscape when it comes to streaming APIs, but streaming can just as easily be applied to any time-series, historical, or archived data. We’ve played around with many different ways of accomplishing this, and we enjoy providing very low-cost, scrappy, and innovative ways of doing it with the tools API developers are already using.

One of the ways we’ve been delivering any time data via streaming APIs is using GitHub. GitHub repositories provide a great way to publicly or privately publish, share, and collaborate around JSON data sources. Using Streamdata.io, you can proxy any JSON file on GitHub, treating it just like you would any other JSON API. The difference is that the JSON file on GitHub is static, although you can make it more dynamic using the GitHub API, trickling in updates to the JSON at a desired interval and on a desired schedule to create exactly the streaming experience you want. The result is static JSON files stored on GitHub, dished up using Server-Sent Events (SSE), with the stream controlled through the GitHub API.
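As a rough illustration of the consuming side, the sketch below builds the URL of a raw JSON file on GitHub, wraps it in a proxy URL, and parses the resulting SSE stream. The `raw.githubusercontent.com` URL shape and the SSE line format are standard. The proxy base URL and the `X-Sd-Token` query parameter are assumptions about how the Streamdata.io proxy is addressed, so check them against the current Streamdata.io documentation. The helper names are hypothetical.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen

# Assumption: the proxy is addressed by prefixing the target URL and passing
# an application token as a query parameter. Verify against current docs.
PROXY_BASE = "https://streamdata.motwin.net/"


def raw_github_url(owner, repo, branch, path):
    """URL of a file served as-is from a GitHub repository."""
    return f"https://raw.githubusercontent.com/{owner}/{repo}/{branch}/{path}"


def proxied_url(target_url, token):
    """Wrap a JSON URL in the (assumed) streaming proxy URL."""
    return f"{PROXY_BASE}{target_url}?X-Sd-Token={quote(token)}"


def parse_sse(lines):
    """Yield (event, data) pairs from an iterable of SSE text lines."""
    event, data = "message", []
    for line in lines:
        line = line.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment / keep-alive line
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format drops a single leading space after the colon.
            data.append(value[1:] if value.startswith(" ") else value)


def stream(url):
    """Open the proxied URL and yield parsed events as they arrive."""
    request = Request(url, headers={"Accept": "text/event-stream"})
    with urlopen(request) as response:
        yield from parse_sse(raw.decode("utf-8") for raw in response)
```

A caller would loop over `stream(proxied_url(raw_github_url(...), token))` and handle each event. The repository coordinates and the token are placeholders you would supply yourself.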

Why would you want to do this? Using GitHub as a streaming data store allows for the delivery of ephemeral streams of data from any time in the past, from any scenario, recreating a specific experience as a fresh stream of data. It allows API providers to replay specific periods of time, historic events, and other situations that have already passed, delivered as if they were happening in the moment. You take the legacy data from a database and send it to the GitHub API as incremental updates, on a specific schedule, pushing it to the real-time stream through a single, regularly updated static JSON file. This mimics what a dynamic API would normally do, but as a more temporary and more controlled publishing scenario that can still easily be streamed using Streamdata.io.
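The publishing side of that replay can be sketched as follows, assuming the historic records have already been read from the database into a list. It uses GitHub's "create or update file contents" REST endpoint, which takes a PUT request with base64-encoded content and, when the file already exists, the file's current `sha`. The function names, batch size, and interval are hypothetical choices for illustration, not a prescribed implementation.

```python
import base64
import json
import time
from urllib.request import Request, urlopen

API = "https://api.github.com"


def cumulative_snapshots(records, batch_size):
    """Yield growing lists: each snapshot adds the next batch of records."""
    for end in range(batch_size, len(records) + batch_size, batch_size):
        yield records[:end]


def build_update(snapshot, sha=None, message="Replay update"):
    """Build the JSON body for GitHub's file-contents PUT call."""
    encoded = base64.b64encode(json.dumps(snapshot).encode("utf-8"))
    body = {"message": message, "content": encoded.decode("ascii")}
    if sha is not None:
        body["sha"] = sha  # required when the file already exists
    return body


def put_file(owner, repo, path, body, token):
    """Write the file and return its new blob sha for the next update."""
    request = Request(
        f"{API}/repos/{owner}/{repo}/contents/{path}",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
    with urlopen(request) as response:
        return json.load(response)["content"]["sha"]


def replay(records, owner, repo, path, token,
           batch_size=10, interval=5.0, sha=None):
    """Publish the records as a series of scheduled, incremental commits."""
    for snapshot in cumulative_snapshots(records, batch_size):
        sha = put_file(owner, repo, path, build_update(snapshot, sha), token)
        time.sleep(interval)  # controls the pace of the replayed stream
```

Each commit replaces the whole file with a slightly longer snapshot, so a proxy polling the file sees the data grow over time. If the file already exists before the replay starts, pass its current `sha` as the starting value.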

This approach to streaming historic data is meant to be an innovative way of delivering streams of data beyond what we are used to. We like using GitHub and the GitHub API for these types of projects because they are tools that people are familiar with and are already using. Hopefully it lights up your imagination when it comes to streaming data, showing that historic data can also be streamed, replaying meaningful moments from the past and giving us entirely new ways of providing access to, and even monetizing, our legacy data assets. It expands the streaming data horizon and reveals the opportunity that exists across the data landscape.


Original source: streamdata.io blog