Fastest way of creating and sorting the timestamp data with Python?
Let's say I have two arrays. The first holds the timestamps and the second holds the data.
timeStamp = ['0001','0002','0003',...,'9999']
data = [6234,2372,1251,...,5172]
What would be the best way to store them? And let’s say I would like to sort the data from the smallest to the largest number while keeping the timestamp values attached to them?
There are multiple ways of doing this. Let’s take the following data:
timeStamp = [9,1,2,3,9999]
data = [1245, 6234,2372,1251,5172]
Using base Python and zip
This is the default way of handling data in Python, specifically lists.
zip lets you quite literally zip two or more lists element-wise, yielding tuples of paired elements (in Python 3 it returns an iterator rather than a list). You can then use sorted with a lambda function as the key to sort the pairs by a specific position in each tuple.
l = zip(timeStamp, data)  # store the 2 arrays by attaching them element-wise
print(sorted(l, key=lambda x: x[0]))  # x[0] is the timestamp in each (timestamp, data) tuple
[(1, 6234), (2, 2372), (3, 1251), (9, 1245), (9999, 5172)]
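The question also asks for sorting by the data values while keeping each timestamp attached. A minimal sketch of that with zip and sorted, assuming the same sample lists as above; the names pairs, by_data, sorted_ts and sorted_data are illustrative only:

```python
from operator import itemgetter

timeStamp = [9, 1, 2, 3, 9999]
data = [1245, 6234, 2372, 1251, 5172]

# Pair each timestamp with its value; list() materializes the zip iterator
pairs = list(zip(timeStamp, data))

# Sort by the second element of each tuple (the data value)
by_data = sorted(pairs, key=itemgetter(1))
print(by_data)

# Unzip back into two aligned lists if separate sequences are needed
sorted_ts, sorted_data = map(list, zip(*by_data))
```

itemgetter(1) does the same job as lambda x: x[1]; either works as the key.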
Using NumPy and argsort
NumPy allows you to work with multidimensional arrays. For 2 lists, you can simply np.stack them together to create a 2D array, with the timestamps in the first row and the data in the second.
To sort, call argsort() on the first row (the timestamps), which returns the indexes that would put that row in sorted order. You can then use these indexes to reorder the columns of the original 2D array, giving the whole array sorted by timestamp.
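import numpy as np
arr = np.stack([timeStamp, data])  # shape (2, 5): row 0 = timestamps, row 1 = data
arr[:, arr[0].argsort()]  # reorder the columns by the sorted order of the timestamp row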
array([[   1,    2,    3,    9, 9999],
       [6234, 2372, 1251, 1245, 5172]])
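The same idea works for sorting by the data values instead: take argsort() of the data row and apply it to both rows. This is a small sketch using the same sample lists; order and by_data are names chosen here for illustration:

```python
import numpy as np

timeStamp = [9, 1, 2, 3, 9999]
data = [1245, 6234, 2372, 1251, 5172]

# Row 0 holds the timestamps, row 1 holds the data
arr = np.stack([timeStamp, data])

# Indices that would sort the data row in ascending order
order = arr[1].argsort()

# Apply the same column order to both rows so the pairs stay aligned
by_data = arr[:, order]
print(by_data)
```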
Using pandas DataFrames and sort_values
Finally, the best way to work on multiple lists in conjunction is to treat them as columns in a DataFrame. Pandas provides a handy framework for working with column/row arranged data, which is very useful in this case because you can also use column names to identify each array/column.
sort_values lets you quickly sort the complete data by column name.
import pandas as pd
df = pd.DataFrame(zip(timeStamp, data), columns=['timeStamp','data'])
print(df.sort_values('timeStamp'))
   timeStamp  data
1          1  6234
2          2  2372
3          3  1251
0          9  1245
4       9999  5172
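To sort by the data values instead, as the question asks, the column name passed to sort_values can simply be changed. A short sketch with the same sample lists; building the frame from a dict and calling reset_index are choices made here, not part of the answer above:

```python
import pandas as pd

timeStamp = [9, 1, 2, 3, 9999]
data = [1245, 6234, 2372, 1251, 5172]

# One column per list; the dict keys become the column names
df = pd.DataFrame({'timeStamp': timeStamp, 'data': data})

# Sort rows by the data column; each timestamp moves with its value
by_data = df.sort_values('data')

# Optionally renumber the index 0..n-1 after sorting
by_data = by_data.reset_index(drop=True)
print(by_data)
```

sort_values returns a new DataFrame by default, so df itself keeps its original order.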
Answered By – Akshay Sehgal