Dataset statistics
Number of variables | 4 |
---|---|
Number of observations | 2026 |
Missing cells | 71 |
Missing cells (%) | 0.9% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 267.2 KiB |
Average record size in memory | 135.1 B |
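The derived figures in this table can be cross-checked from the raw counts. The following is a minimal sketch, assuming the missing-cell percentage is the number of missing cells divided by observations times variables, and that the average record size is the total memory divided by the number of observations. The variable names are illustrative and not part of the report.

```python
# Values copied from the "Dataset statistics" table above.
n_variables = 4
n_observations = 2026
missing_cells = 71
total_memory_kib = 267.2

# Assumption: percentage of missing cells = missing / (rows * columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: average record size = total memory / number of rows (1 KiB = 1024 B).
avg_record_bytes = total_memory_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions both results round to the values reported in the table (0.9% and 135.1 B).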
Variable types
Categorical | 1 |
---|---|
Numeric | 2 |
DateTime | 1 |
DATE has a high cardinality: 2026 distinct values | High cardinality |
SP500 has 70 (3.5%) missing values | Missing |
DATE is uniformly distributed | Uniform |
DATE has unique values | Unique |
Date has unique values | Unique |
% 1-Day Return has 71 (3.5%) zeros | Zeros |
Reproduction
Analysis started | 2022-03-13 23:55:08.555107 |
---|---|
Analysis finished | 2022-03-13 23:55:09.433427 |
Duration | 0.88 seconds |
Software version | pandas-profiling v3.1.0 |
Download configuration | config.json |
Distinct | 2026 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 132.7 KiB |
2014-06-06 | 1 |
---|---|
2019-08-02 | 1 |
2019-08-21 | 1 |
2019-08-20 | 1 |
2019-08-19 | 1 |
Other values (2021) | 2021 |
Length
Max length | 10 |
---|---|
Median length | 10 |
Mean length | 10 |
Min length | 10 |
Characters and Unicode
Total characters | 0 |
---|---|
Distinct characters | 0 |
Distinct categories | 0 |
Distinct scripts | 0 |
Distinct blocks | 0 |
Unique
Unique | 2026 |
---|---|
Unique (%) | 100.0% |
Sample
1st row | 2014-06-06 |
---|---|
2nd row | 2014-06-09 |
3rd row | 2014-06-10 |
4th row | 2014-06-11 |
5th row | 2014-06-12 |
Common Values
Value | Count | Frequency (%) |
2014-06-06 | 1 | < 0.1% |
2019-08-02 | 1 | < 0.1% |
2019-08-21 | 1 | < 0.1% |
2019-08-20 | 1 | < 0.1% |
2019-08-19 | 1 | < 0.1% |
2019-08-16 | 1 | < 0.1% |
2019-08-15 | 1 | < 0.1% |
2019-08-14 | 1 | < 0.1% |
2019-08-13 | 1 | < 0.1% |
2019-08-12 | 1 | < 0.1% |
Other values (2016) | 2016 |
Length
Value | Count | Frequency (%) |
2014-06-06 | 1 | < 0.1% |
2014-07-02 | 1 | < 0.1% |
2014-06-13 | 1 | < 0.1% |
2014-06-16 | 1 | < 0.1% |
2014-06-17 | 1 | < 0.1% |
2014-06-18 | 1 | < 0.1% |
2014-06-19 | 1 | < 0.1% |
2014-06-20 | 1 | < 0.1% |
2014-06-23 | 1 | < 0.1% |
2014-06-24 | 1 | < 0.1% |
Other values (2016) | 2016 |
Most occurring characters
Value | Count | Frequency (%) |
No values found. |
Most occurring categories
Value | Count | Frequency (%) |
No values found. |
Most frequent character per category
Most occurring scripts
Value | Count | Frequency (%) |
No values found. |
Most frequent character per script
Most occurring blocks
Value | Count | Frequency (%) |
No values found. |
Most frequent character per block
Distinct | 1945 |
---|---|
Distinct (%) | 99.4% |
Missing | 70 |
Missing (%) | 3.5% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 2801.088016 |
Minimum | 1829.08 |
---|---|
Maximum | 4796.56 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 16.0 KiB |
Quantile statistics
Minimum | 1829.08 |
---|---|
5-th percentile | 1961.0275 |
Q1 | 2108.905 |
median | 2670.025 |
Q3 | 3130.0375 |
95-th percentile | 4456.255 |
Maximum | 4796.56 |
Range | 2967.48 |
Interquartile range (IQR) | 1021.1325 |
Descriptive statistics
Standard deviation | 776.9689512 |
---|---|
Coefficient of variation (CV) | 0.2773811271 |
Kurtosis | -0.05808348742 |
Mean | 2801.088016 |
Median Absolute Deviation (MAD) | 548.6 |
Skewness | 0.9570039718 |
Sum | 5478928.16 |
Variance | 603680.7511 |
Monotonicity | Not monotonic |
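As a rough illustration of how the quantile and descriptive statistics above relate to the raw values, here is a minimal sketch using only the Python standard library. It assumes quartiles are computed by linear interpolation between order statistics and that the standard deviation is the sample version (dividing by n - 1); the report does not state either choice. The helper name `summarize` is hypothetical, and the sample is the SP500 column from the "First rows" table, not the full dataset.

```python
import statistics

def summarize(values):
    """Compute a few of the summary statistics listed in the report."""
    data = [v for v in values if v is not None]  # drop missing observations
    # Assumption: quartiles use linear interpolation between order statistics.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    mean = statistics.fmean(data)
    std = statistics.stdev(data)  # assumption: sample standard deviation (n - 1)
    return {
        "range": max(data) - min(data),
        "iqr": q3 - q1,
        "median": median,
        "mean": mean,
        "std": std,
        "cv": std / mean,  # coefficient of variation
        # MAD: median of the absolute deviations from the median.
        "mad": statistics.median(abs(v - median) for v in data),
    }

# SP500 values from the first ten rows of the dataset.
sample = [1949.44, 1951.27, 1950.79, 1943.89, 1930.11,
          1936.16, 1937.78, 1941.99, 1956.98, 1959.48]
stats = summarize(sample)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

The same relationships hold for the reported figures: the range is the maximum minus the minimum (4796.56 - 1829.08 = 2967.48), the IQR is Q3 minus Q1, and the CV is the standard deviation divided by the mean.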
Value | Count | Frequency (%) |
2066.66 | 2 | 0.1% |
2439.07 | 2 | 0.1% |
2783.02 | 2 | 0.1% |
2268.9 | 2 | 0.1% |
2926.46 | 2 | 0.1% |
2373.47 | 2 | 0.1% |
2095.84 | 2 | 0.1% |
2102.31 | 2 | 0.1% |
2080.15 | 2 | 0.1% |
2723.06 | 2 | 0.1% |
Other values (1935) | 1936 | |
(Missing) | 70 | 3.5% |
Value | Count | Frequency (%) |
1829.08 | 1 | |
1851.86 | 1 | |
1852.21 | 1 | |
1853.44 | 1 | |
1859.33 | 1 | |
1862.49 | 1 | |
1862.76 | 1 | |
1864.78 | 1 | |
1867.61 | 1 | |
1868.99 | 1 |
Value | Count | Frequency (%) |
4796.56 | 1 | |
4793.54 | 1 | |
4793.06 | 1 | |
4791.19 | 1 | |
4786.35 | 1 | |
4778.73 | 1 | |
4766.18 | 1 | |
4726.35 | 1 | |
4725.79 | 1 | |
4713.07 | 1 |
Distinct | 1955 |
---|---|
Distinct (%) | 96.5% |
Missing | 1 |
Missing (%) | < 0.1% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 0.04397695168 |
Minimum | -11.98405028 |
---|---|
Maximum | 9.38276571 |
Zeros | 71 |
Zeros (%) | 3.5% |
Negative | 892 |
Negative (%) | 44.0% |
Memory size | 16.0 KiB |
Quantile statistics
Minimum | -11.98405028 |
---|---|
5-th percentile | -1.630657564 |
Q1 | -0.309931843 |
median | 0.03426822371 |
Q3 | 0.5022110835 |
95-th percentile | 1.468397098 |
Maximum | 9.38276571 |
Range | 21.36681599 |
Interquartile range (IQR) | 0.8121429265 |
Descriptive statistics
Standard deviation | 1.094093047 |
---|---|
Coefficient of variation (CV) | 24.8787832 |
Kurtosis | 19.61983268 |
Mean | 0.04397695168 |
Median Absolute Deviation (MAD) | 0.4114931522 |
Skewness | -0.6434118056 |
Sum | 89.05332716 |
Variance | 1.197039595 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
0 | 71 | 3.5% |
1.301700683 | 1 | < 0.1% |
-0.0506081527 | 1 | < 0.1% |
0.8246825558 | 1 | < 0.1% |
-0.7914764079 | 1 | < 0.1% |
1.210587535 | 1 | < 0.1% |
1.442618345 | 1 | < 0.1% |
0.2464268112 | 1 | < 0.1% |
-2.929276361 | 1 | < 0.1% |
1.476202861 | 1 | < 0.1% |
Other values (1945) | 1945 |
Value | Count | Frequency (%) |
-11.98405028 | 1 | |
-9.511268047 | 1 | |
-7.596968076 | 1 | |
-5.894412157 | 1 | |
-5.183082331 | 1 | |
-4.886841092 | 1 | |
-4.416327867 | 1 | |
-4.414239783 | 1 | |
-4.335952253 | 1 | |
-4.097924428 | 1 |
Value | Count | Frequency (%) |
9.38276571 | 1 | |
9.287119453 | 1 | |
7.033130412 | 1 | |
6.241416084 | 1 | |
5.995482224 | 1 | |
4.959380715 | 1 | |
4.939633578 | 1 | |
4.603922524 | 1 | |
4.220259242 | 1 | |
3.90338454 | 1 |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
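The calculation described above can be sketched in plain Python. This is an illustrative implementation, not the code the report itself runs: the helper names (`average_ranks`, `pearson_r`, `spearman_rho`) are hypothetical, ties are assumed to receive the mean of their ranks, and the data are toy values rather than values from the dataset.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)  # the 1/(n - 1) factors cancel

def spearman_rho(xs, ys):
    """Pearson correlation applied to the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Toy data (not from the dataset): y = x**2 is monotonic but not linear.
xs = [1, 2, 3, 4, 5]
ys = [x ** 2 for x in xs]
print(spearman_rho(xs, ys))  # rank order is identical, so rho is 1 up to rounding
print(pearson_r(xs, ys))     # below 1 because the relationship is not linear
```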
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
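A minimal sketch of this formula, together with a check of the location and scale invariance, might look as follows. The function name `pearson_r` and the toy data are illustrative assumptions, not part of the report; the scale factors are kept positive, since a negative factor would flip the sign of r.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)  # the 1/(n - 1) factors cancel

# Toy data (not from the dataset): roughly linear with some noise.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]
r = pearson_r(xs, ys)

# Shift and rescale each variable separately; r should be unchanged.
r_transformed = pearson_r([2 * x + 3 for x in xs], [0.5 * y - 1 for y in ys])
print(r, r_transformed)
```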
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
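The pair-counting definition above translates directly into code. The sketch below implements that definition as written (often called tau-a); statistical libraries commonly report a tie-corrected variant instead, so results can differ when the data contain ties. The function name `kendall_tau_a` and the toy data are illustrative assumptions.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # sign == 0 means a tie in x or y; it counts toward neither group
    total_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / total_pairs

# Toy data (not from the dataset): one swapped pair out of six.
xs = [1, 2, 3, 4]
ys = [1, 3, 2, 4]
tau = kendall_tau_a(xs, ys)
print(tau)
```

Here five of the six pairs are concordant and one is discordant, so τ = (5 - 1) / 6.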
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available.
First rows
Index | DATE | SP500 | % 1-Day Return | Date |
---|---|---|---|---|
0 | 2014-06-06 | 1949.44 | NaN | 2014-06-06 |
1 | 2014-06-09 | 1951.27 | 0.093873 | 2014-06-09 |
2 | 2014-06-10 | 1950.79 | -0.024599 | 2014-06-10 |
3 | 2014-06-11 | 1943.89 | -0.353703 | 2014-06-11 |
4 | 2014-06-12 | 1930.11 | -0.708888 | 2014-06-12 |
5 | 2014-06-13 | 1936.16 | 0.313454 | 2014-06-13 |
6 | 2014-06-16 | 1937.78 | 0.083671 | 2014-06-16 |
7 | 2014-06-17 | 1941.99 | 0.217259 | 2014-06-17 |
8 | 2014-06-18 | 1956.98 | 0.771889 | 2014-06-18 |
9 | 2014-06-19 | 1959.48 | 0.127748 | 2014-06-19 |
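The `% 1-Day Return` column appears consistent with a simple percent change of `SP500` relative to the previous row, which would also explain the NaN in the first row. The sketch below checks that reading against the first few rows; the formula is an inference from the table rather than something the report states, and the helper name `pct_returns` is hypothetical.

```python
# SP500 values from the first rows shown above.
prices = [1949.44, 1951.27, 1950.79, 1943.89, 1930.11]

def pct_returns(values):
    """Percent change from the previous value; the first entry has no predecessor."""
    returns = [None]
    for prev, curr in zip(values, values[1:]):
        returns.append((curr / prev - 1) * 100)
    return returns

returns = pct_returns(prices)
for price, ret in zip(prices, returns):
    label = "NaN" if ret is None else f"{ret:.6f}"
    print(price, label)
```

Under this assumption the computed returns should agree with the tabulated values to the displayed six decimal places.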
Last rows
Index | DATE | SP500 | % 1-Day Return | Date |
---|---|---|---|---|
2016 | 2022-02-28 | 4373.94 | -0.244261 | 2022-02-28 |
2017 | 2022-03-01 | 4306.26 | -1.547346 | 2022-03-01 |
2018 | 2022-03-02 | 4386.54 | 1.864263 | 2022-03-02 |
2019 | 2022-03-03 | 4363.49 | -0.525471 | 2022-03-03 |
2020 | 2022-03-04 | 4328.87 | -0.793402 | 2022-03-04 |
2021 | 2022-03-07 | 4201.09 | -2.951810 | 2022-03-07 |
2022 | 2022-03-08 | 4170.70 | -0.723384 | 2022-03-08 |
2023 | 2022-03-09 | 4277.88 | 2.569832 | 2022-03-09 |
2024 | 2022-03-10 | 4259.52 | -0.429185 | 2022-03-10 |
2025 | 2022-03-11 | 4204.31 | -1.296155 | 2022-03-11 |