My dataset consists of time series data for stock prices. However, the data is not continuous: some days are missing. I am looking for an efficient way to fill in these missing days with the previous day's values.
In total, there are more than 1000 markets, each with over 1000 days of data. Each day's entry contains a timestamp and pricing information.
One of the tasks I want to perform is to retrieve the prices for every market on a specific day and then sort them.
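To make that task concrete, here is a minimal sketch of what I have in mind. The names `lastAtOrBefore`, `closesOnDay`, and `sampleMarkets` are made up for illustration, and I am assuming each market's `values` array is sorted by ascending `time` and that sorting by `close` is what is wanted.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper: last entry at or before `time`, or undefined.
// Assumes `values` is sorted by ascending `time`.
function lastAtOrBefore(values, time) {
  for (let i = values.length - 1; i >= 0; i--) {
    if (values[i].time <= time) return values[i];
  }
  return undefined;
}

// Collect one close price per market for the given day, highest first.
function closesOnDay(markets, time) {
  return markets
    .map(market => ({ name: market.name, entry: lastAtOrBefore(market.values, time) }))
    .filter(row => row.entry !== undefined) // skip markets with no data yet
    .map(row => ({ name: row.name, close: row.entry.close }))
    .sort((a, b) => b.close - a.close);
}

// Made-up sample data: market A has no entry on day 1, market C starts on day 2.
const sampleMarkets = [
  { name: "A", values: [{ time: 0, close: 1.5 }, { time: 2 * DAY_MS, close: 1.7 }] },
  { name: "B", values: [{ time: 0, close: 3.0 }, { time: DAY_MS, close: 2.5 }] },
  { name: "C", values: [{ time: 2 * DAY_MS, close: 9.9 }] },
];

const ranked = closesOnDay(sampleMarkets, DAY_MS);
console.log(ranked);
```

The linear scan inside `lastAtOrBefore` is the part I suspect is too slow when repeated for every market and every day.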
I am uncertain whether it would be better to populate all the missing days up front by inserting timestamps and prices where necessary. This would increase memory usage, but it could make lookups faster. I am not sure about the exact memory requirements, but with 1000 markets, 1000 days, and 6 object keys, if a number occupies 8 bytes, the raw data would need about 48 MB (1000 × 1000 × 6 × 8 bytes), plus whatever overhead the objects add.
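A rough sketch of the populate-everything approach, as I understand it. `forwardFill` and `DAY_MS` are names I made up, and the code assumes `values` is sorted by ascending `time` and that days are exactly 24 hours apart (which would not hold across a daylight saving change).

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Return a new array with one entry per day between the first and last
// timestamps. A missing day gets a copy of the previous entry with its
// `time` moved to that day. Assumes `values` is sorted by ascending `time`.
function forwardFill(values) {
  if (values.length === 0) return [];
  const filled = [];
  const lastTime = values[values.length - 1].time;
  let next = 0;             // index of the next unread real entry
  let previous = values[0]; // most recent real entry seen so far

  for (let t = values[0].time; t <= lastTime; t += DAY_MS) {
    // Consume every real entry at or before the current day.
    while (next < values.length && values[next].time <= t) {
      previous = values[next];
      next++;
    }
    // Reuse the real entry when it falls on this day, otherwise copy it.
    filled.push(previous.time === t ? previous : { ...previous, time: t });
  }
  return filled;
}

// Made-up example: days 1 and 2 are missing.
const sparse = [
  { time: 0, close: 1.0 },
  { time: 3 * DAY_MS, close: 2.0 },
];
const dense = forwardFill(sparse);
console.log(dense.length);
```

After filling, the entry for a given day sits at a fixed index (days since the first timestamp), so no searching is needed, at the cost of storing the copied entries.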
Alternatively, I could leave the data sparse and look up the most recent earlier entry on demand, which would mean searching each market's array for the matching timestamp. This might be computationally expensive if carried out for every day of the dataset.
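If the arrays are sorted by `time` (an assumption on my part), the search could be a binary search instead of a scan. A minimal sketch, with `indexAtOrBefore` being a made-up name:

```javascript
// Index of the last entry with time <= target, or -1 if there is none.
// Assumes `values` is sorted by ascending `time`.
function indexAtOrBefore(values, target) {
  let lo = 0;
  let hi = values.length - 1;
  let found = -1;
  while (lo <= hi) {
    const mid = (lo + hi) >>> 1; // integer midpoint
    if (values[mid].time <= target) {
      found = mid;   // candidate; look for a later one to the right
      lo = mid + 1;
    } else {
      hi = mid - 1;  // entry is too late; look to the left
    }
  }
  return found;
}

// Made-up example: there is no entry at time 30, so the entry at 20 is used.
const series = [{ time: 10 }, { time: 20 }, { time: 40 }];
console.log(indexAtOrBefore(series, 30));
```

With about 1000 entries per market, each lookup needs roughly 10 comparisons, so a single day across 1000 markets would be on the order of 10,000 comparisons.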
Explore a sample dataset on JSBin
let data = {
  "name": "MARKET",
  "values": [{
      "time": 1440338400000,
      "close": 0.142163,
      "high": 0.152869,
      "low": 0.142163,
      "open": 0.152221,
      "volume": 14.2163,
      "marketCap": 0
    },
    // Remaining dataset entries...
  ]
};

// Print each entry's timestamp as a human-readable date.
data.values.forEach(value => {
  const date = new Date(value.time);
  console.log(date.toString());
});