Dataset statistics
Number of variables | 9 |
---|---|
Number of observations | 846 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 59.6 KiB |
Average record size in memory | 72.2 B |
Variable types
NUM | 7 |
---|---|
DATE | 1 |
CAT | 1 |
Reproduction
Analysis started | 2020-12-02 17:48:23.786745 |
---|---|
Analysis finished | 2020-12-02 17:48:32.003231 |
Duration | 8.22 seconds |
Software version | pandas-profiling v2.9.0 |
df_index
Distinct | 846 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 6.6 KiB |
Minimum | 2020-10-11 00:00:00 |
---|---|
Maximum | 2020-11-30 23:00:00 |
Histogram with fixed size bins (bins=50)
data
Distinct | 846 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 6.6 KiB |
Value | Count | Frequency (%) | |
16/11/2020 16:00 | 1 | 0.1% | |
19/11/2020 23:00 | 1 | 0.1% | |
19/11/2020 15:00 | 1 | 0.1% | |
13/11/2020 01:00 | 1 | 0.1% | |
20/11/2020 12:00 | 1 | 0.1% | |
24/11/2020 03:00 | 1 | 0.1% | |
24/10/2020 06:00 | 1 | 0.1% | |
19/10/2020 21:00 | 1 | 0.1% | |
16/11/2020 10:00 | 1 | 0.1% | |
17/11/2020 10:00 | 1 | 0.1% | |
Other values (836) | 836 | 98.8% |
Frequencies of value counts
Unique
Unique | 846 |
---|---|
Unique (%) | 100.0% |
Histogram of lengths of the category
Length
Max length | 16 |
---|---|
Median length | 16 |
Mean length | 16 |
Min length | 16 |
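The length statistics above show that every value is exactly 16 characters long, which is consistent with a fixed-width timestamp string. As a minimal sketch, assuming the layout is day/month/year followed by hour:minute (an inference from values such as 24/10/2020, not something the report states), the strings could be parsed with the standard library. The names `ASSUMED_FORMAT` and `parse_timestamp` are illustrative only.

```python
from datetime import datetime

# Assumed layout of the string column, inferred from the sample values;
# the report itself does not state the format.
ASSUMED_FORMAT = "%d/%m/%Y %H:%M"


def parse_timestamp(text: str) -> datetime:
    """Parse a 'dd/mm/yyyy HH:MM' string into a datetime."""
    return datetime.strptime(text, ASSUMED_FORMAT)


sample = "16/11/2020 16:00"  # first value in the frequency table
parsed = parse_timestamp(sample)
print(len(sample), parsed.isoformat(sep=" "))
```

Stating the format explicitly avoids the day/month ambiguity that automatic date inference can introduce for values such as 10/11/2020.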
NO2Arpat
Real number (ℝ≥0)
Distinct | 92 |
---|---|
Distinct (%) | 10.9% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 32.90661939 |
---|---|
Minimum | 1 |
Maximum | 100 |
Zeros | 0 |
Zeros (%) | 0.0% |
Memory size | 6.6 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 8 |
Q1 | 18 |
median | 29 |
Q3 | 44 |
95-th percentile | 69.75 |
Maximum | 100 |
Range | 99 |
Interquartile range (IQR) | 26 |
Descriptive statistics
Standard deviation | 19.21265679 |
---|---|
Coefficient of variation (CV) | 0.5838538613 |
Kurtosis | 0.2337327023 |
Mean | 32.90661939 |
Median Absolute Deviation (MAD) | 12 |
Skewness | 0.7911693392 |
Sum | 27839 |
Variance | 369.126181 |
Monotonicity | Not monotonic |
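Several of the statistics reported for NO2Arpat are related by simple identities, so they can be cross-checked against each other. The sketch below copies the reported values and verifies them within a small tolerance, since the report rounds to about ten significant digits. The variable names are illustrative.

```python
import math

# Values copied from the NO2Arpat tables in this report.
n = 846
total = 27839
mean = 32.90661939
std = 19.21265679
variance = 369.126181
cv = 0.5838538613
minimum, maximum = 1, 100
q1, q3 = 18, 44

checks = {
    "mean = sum / n": math.isclose(total / n, mean, rel_tol=1e-6),
    "variance = std ** 2": math.isclose(std ** 2, variance, rel_tol=1e-6),
    "CV = std / mean": math.isclose(std / mean, cv, rel_tol=1e-6),
    "range = max - min": maximum - minimum == 99,
    "IQR = Q3 - Q1": q3 - q1 == 26,
}
for name, ok in checks.items():
    print(f"{name}: {'consistent' if ok else 'mismatch'}")
```

The same identities apply to the other numeric variables in the report.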
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%) | |
18 | 26 | 3.1% | |
22 | 23 | 2.7% | |
16 | 22 | 2.6% | |
24 | 22 | 2.6% | |
23 | 22 | 2.6% | |
41 | 21 | 2.5% | |
25 | 21 | 2.5% | |
28 | 21 | 2.5% | |
12 | 20 | 2.4% | |
26 | 20 | 2.4% | |
Other values (82) | 628 | 74.2% |
Value | Count | Frequency (%) | |
1 | 5 | 0.6% | |
2 | 1 | 0.1% | |
3 | 6 | 0.7% | |
4 | 6 | 0.7% | |
5 | 5 | 0.6% |
Value | Count | Frequency (%) | |
100 | 1 | 0.1% | |
98 | 1 | 0.1% | |
94 | 1 | 0.1% | |
92 | 2 | 0.2% | |
91 | 1 | 0.1% |
tair
Real number (ℝ≥0)
Distinct | 843 |
---|---|
Distinct (%) | 99.6% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 14.481751 |
---|---|
Minimum | 1.176315789 |
Maximum | 26.43589744 |
Zeros | 0 |
Zeros (%) | 0.0% |
Memory size | 6.6 KiB |
Quantile statistics
Minimum | 1.176315789 |
---|---|
5-th percentile | 5.916666667 |
Q1 | 11.1974359 |
median | 14.28471154 |
Q3 | 17.85675676 |
95-th percentile | 22.74381757 |
Maximum | 26.43589744 |
Range | 25.25958165 |
Interquartile range (IQR) | 6.659320859 |
Descriptive statistics
Standard deviation | 4.954726645 |
---|---|
Coefficient of variation (CV) | 0.3421358815 |
Kurtosis | -0.3295811056 |
Mean | 14.481751 |
Median Absolute Deviation (MAD) | 3.369230769 |
Skewness | -0.03068165098 |
Sum | 12251.56135 |
Variance | 24.54931612 |
Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%) | |
15.26923077 | 2 | 0.2% | |
19.38205128 | 2 | 0.2% | |
10.82307692 | 2 | 0.2% | |
13.66666667 | 1 | 0.1% | |
3.05 | 1 | 0.1% | |
12.61 | 1 | 0.1% | |
5.07 | 1 | 0.1% | |
14.43513514 | 1 | 0.1% | |
17.93333333 | 1 | 0.1% | |
14.76111111 | 1 | 0.1% | |
Other values (833) | 833 | 98.5% |
Value | Count | Frequency (%) | |
1.176315789 | 1 | 0.1% | |
1.446153846 | 1 | 0.1% | |
1.702564103 | 1 | 0.1% | |
2.125 | 1 | 0.1% | |
2.1875 | 1 | 0.1% |
Value | Count | Frequency (%) | |
26.43589744 | 1 | 0.1% | |
26.25405405 | 1 | 0.1% | |
26.21538462 | 1 | 0.1% | |
26.15526316 | 1 | 0.1% | |
25.74871795 | 1 | 0.1% |
rad
Real number (ℝ≥0)
Distinct | 821 |
---|---|
Distinct (%) | 97.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 75.88133125 |
---|---|
Minimum | 19.33684211 |
Maximum | 99.9 |
Zeros | 0 |
Zeros (%) | 0.0% |
Memory size | 6.6 KiB |
Quantile statistics
Minimum | 19.33684211 |
---|---|
5-th percentile | 36.62845816 |
Q1 | 60.97348739 |
median | 82.33996795 |
Q3 | 93.71083333 |
95-th percentile | 99.3381891 |
Maximum | 99.9 |
Range | 80.56315789 |
Interquartile range (IQR) | 32.73734594 |
Descriptive statistics
Standard deviation | 20.45013106 |
---|---|
Coefficient of variation (CV) | 0.2695014798 |
Kurtosis | -0.5675791449 |
Mean | 75.88133125 |
Median Absolute Deviation (MAD) | 14.06887821 |
Skewness | -0.7123314644 |
Sum | 64195.60624 |
Variance | 418.2078604 |
Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%) | |
99.9 | 19 | 2.2% | |
99.9 | 6 | 0.7% | |
97.4 | 2 | 0.2% | |
98.5775 | 2 | 0.2% | |
62.38888889 | 1 | 0.1% | |
95.3425 | 1 | 0.1% | |
99.3 | 1 | 0.1% | |
91.81025641 | 1 | 0.1% | |
50.18461538 | 1 | 0.1% | |
90.63076923 | 1 | 0.1% | |
Other values (811) | 811 | 95.9% |
Value | Count | Frequency (%) | |
19.33684211 | 1 | 0.1% | |
19.6 | 1 | 0.1% | |
20.2125 | 1 | 0.1% | |
22.20769231 | 1 | 0.1% | |
23.04594595 | 1 | 0.1% |
Value | Count | Frequency (%) | |
99.9 | 19 | 2.2% | |
99.9 | 6 | 0.7% | |
99.9 | 1 | 0.1% | |
99.87 | 1 | 0.1% | |
99.82571429 | 1 | 0.1% |
o3
Real number (ℝ≥0)
Distinct | 836 |
---|---|
Distinct (%) | 98.8% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 761.3314002 |
---|---|
Minimum | 46.66666667 |
Maximum | 914.5384615 |
Zeros | 0 |
Zeros (%) | 0.0% |
Memory size | 6.6 KiB |
Quantile statistics
Minimum | 46.66666667 |
---|---|
5-th percentile | 292.9838141 |
Q1 | 750.3397436 |
median | 827.6076923 |
Q3 | 864.3012821 |
95-th percentile | 893.8853695 |
Maximum | 914.5384615 |
Range | 867.8717949 |
Interquartile range (IQR) | 113.9615385 |
Descriptive statistics
Standard deviation | 179.4971704 |
---|---|
Coefficient of variation (CV) | 0.23576746 |
Kurtosis | 4.654643893 |
Mean | 761.3314002 |
Median Absolute Deviation (MAD) | 46.1625 |
Skewness | -2.269225402 |
Sum | 644086.3646 |
Variance | 32219.23419 |
Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%) | |
890.35 | 3 | 0.4% | |
872.7 | 2 | 0.2% | |
883 | 2 | 0.2% | |
874.8974359 | 2 | 0.2% | |
858 | 2 | 0.2% | |
838.9487179 | 2 | 0.2% | |
840.3 | 2 | 0.2% | |
699 | 2 | 0.2% | |
892.5384615 | 2 | 0.2% | |
618.2702703 | 1 | 0.1% | |
Other values (826) | 826 | 97.6% |
Value | Count | Frequency (%) | |
46.66666667 | 1 | 0.1% | |
46.975 | 1 | 0.1% | |
51.43589744 | 1 | 0.1% | |
51.51282051 | 1 | 0.1% | |
62.225 | 1 | 0.1% |
Value | Count | Frequency (%) | |
914.5384615 | 1 | 0.1% | |
914.1 | 1 | 0.1% | |
911.5384615 | 1 | 0.1% | |
910.5128205 | 1 | 0.1% | |
910.3076923 | 1 | 0.1% |
no2
Real number (ℝ≥0)
Distinct | 831 |
---|---|
Distinct (%) | 98.2% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 190.2322978 |
---|---|
Minimum | 40.35 |
Maximum | 282.974359 |
Zeros | 0 |
Zeros (%) | 0.0% |
Memory size | 6.6 KiB |
Quantile statistics
Minimum | 40.35 |
---|---|
5-th percentile | 85.65144231 |
Q1 | 155.6032095 |
median | 200.0259784 |
Q3 | 232.9038462 |
95-th percentile | 262.9153418 |
Maximum | 282.974359 |
Range | 242.624359 |
Interquartile range (IQR) | 77.30063669 |
Descriptive statistics
Standard deviation | 53.91284455 |
---|---|
Coefficient of variation (CV) | 0.2834053164 |
Kurtosis | -0.09057386816 |
Mean | 190.2322978 |
Median Absolute Deviation (MAD) | 37.14102564 |
Skewness | -0.679053604 |
Sum | 160936.524 |
Variance | 2906.594808 |
Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%) | |
146.3846154 | 2 | 0.2% | |
154.3846154 | 2 | 0.2% | |
92.69230769 | 2 | 0.2% | |
215.2307692 | 2 | 0.2% | |
232.4102564 | 2 | 0.2% | |
163 | 2 | 0.2% | |
175 | 2 | 0.2% | |
243.75 | 2 | 0.2% | |
238.825 | 2 | 0.2% | |
208.1 | 2 | 0.2% | |
Other values (821) | 826 | 97.6% |
Value | Count | Frequency (%) | |
40.35 | 1 | 0.1% | |
41.3 | 1 | 0.1% | |
42.20512821 | 1 | 0.1% | |
42.43589744 | 1 | 0.1% | |
43.46153846 | 1 | 0.1% |
Value | Count | Frequency (%) | |
282.974359 | 1 | 0.1% | |
281.975 | 1 | 0.1% | |
277.7179487 | 1 | 0.1% | |
276 | 1 | 0.1% | |
275.35 | 1 | 0.1% |
noAlpha
Real number (ℝ≥0)
Distinct | 449 |
---|---|
Distinct (%) | 53.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 91.14762683 |
---|---|
Minimum | 89.05263158 |
Maximum | 93.81081081 |
Zeros | 0 |
Zeros (%) | 0.0% |
Memory size | 6.6 KiB |
Quantile statistics
Minimum | 89.05263158 |
---|---|
5-th percentile | 89.81205548 |
Q1 | 90.51761364 |
median | 91.025 |
Q3 | 91.69230769 |
95-th percentile | 92.8872549 |
Maximum | 93.81081081 |
Range | 4.758179232 |
Interquartile range (IQR) | 1.174694056 |
Descriptive statistics
Standard deviation | 0.9204398679 |
---|---|
Coefficient of variation (CV) | 0.01009834156 |
Kurtosis | -0.1639007852 |
Mean | 91.14762683 |
Median Absolute Deviation (MAD) | 0.575 |
Skewness | 0.4515707662 |
Sum | 77110.89229 |
Variance | 0.8472095505 |
Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%) | |
91 | 11 | 1.3% | |
90.8974359 | 10 | 1.2% | |
91.02564103 | 8 | 0.9% | |
90.84615385 | 8 | 0.9% | |
90.87179487 | 7 | 0.8% | |
91.69230769 | 7 | 0.8% | |
90.82051282 | 6 | 0.7% | |
90.41025641 | 6 | 0.7% | |
90.74358974 | 6 | 0.7% | |
90.9 | 6 | 0.7% | |
Other values (439) | 771 | 91.1% |
Value | Count | Frequency (%) | |
89.05263158 | 1 | 0.1% | |
89.05405405 | 1 | 0.1% | |
89.10526316 | 1 | 0.1% | |
89.11111111 | 1 | 0.1% | |
89.12121212 | 1 | 0.1% |
Value | Count | Frequency (%) | |
93.81081081 | 1 | 0.1% | |
93.7027027 | 1 | 0.1% | |
93.67567568 | 1 | 0.1% | |
93.66666667 | 1 | 0.1% | |
93.60526316 | 1 | 0.1% |
no2Alpha
Real number (ℝ≥0)
Distinct | 560 |
---|---|
Distinct (%) | 66.2% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 89.17862518 |
---|---|
Minimum | 85.71794872 |
Maximum | 99.81081081 |
Zeros | 0 |
Zeros (%) | 0.0% |
Memory size | 6.6 KiB |
Quantile statistics
Minimum | 85.71794872 |
---|---|
5-th percentile | 86.64326923 |
Q1 | 87.51370656 |
median | 88.64102564 |
Q3 | 90.42291667 |
95-th percentile | 93.09615385 |
Maximum | 99.81081081 |
Range | 14.09286209 |
Interquartile range (IQR) | 2.909210103 |
Descriptive statistics
Standard deviation | 2.14911548 |
---|---|
Coefficient of variation (CV) | 0.02409899767 |
Kurtosis | 1.640512446 |
Mean | 89.17862518 |
Median Absolute Deviation (MAD) | 1.331175131 |
Skewness | 1.148867842 |
Sum | 75445.1169 |
Variance | 4.618697347 |
Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%) | |
87.92307692 | 7 | 0.8% | |
87.69230769 | 7 | 0.8% | |
87.76923077 | 7 | 0.8% | |
87.5 | 6 | 0.7% | |
87.1 | 5 | 0.6% | |
87.8 | 5 | 0.6% | |
87.17948718 | 5 | 0.6% | |
87.275 | 5 | 0.6% | |
86.69230769 | 4 | 0.5% | |
87.23076923 | 4 | 0.5% | |
Other values (550) | 791 | 93.5% |
Value | Count | Frequency (%) | |
85.71794872 | 1 | 0.1% | |
85.84615385 | 1 | 0.1% | |
85.875 | 1 | 0.1% | |
85.975 | 1 | 0.1% | |
86 | 1 | 0.1% |
Value | Count | Frequency (%) | |
99.81081081 | 1 | 0.1% | |
98.2972973 | 1 | 0.1% | |
98.25641026 | 1 | 0.1% | |
98.12820513 | 1 | 0.1% | |
97.3 | 1 | 0.1% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
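The calculation described above can be sketched in plain Python. This is an illustration of the definition, not the code the profiling tool runs. The function name `pearson_r` is hypothetical, and the sample values are the tair and rad readings of the first five rows shown in the "First rows" table.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator.
    return cov / math.sqrt(var_x * var_y)


# tair and rad for the first five rows of the "First rows" table.
tair = [16.128947, 17.792105, 20.611111, 20.522500, 18.324324]
rad = [78.426316, 70.884211, 54.911111, 53.110000, 63.116216]

r = pearson_r(tair, rad)
# An arbitrary shift and positive rescale of one variable leaves r unchanged.
r_rescaled = pearson_r([1.8 * t + 32 for t in tair], rad)
print(round(r, 4), round(r_rescaled, 4))
```

Five rows are far too few to say anything about the full dataset. The point is only that the two printed values agree, which illustrates the invariance under location and scale.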
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
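As a sketch of that definition, ρ can be computed by ranking each variable and applying the Pearson formula to the ranks. The helper names below are hypothetical, and giving tied values the average of their ranks is a common convention assumed here rather than something the report specifies.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of equal values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Pearson correlation of the rank-transformed variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [x ** 3 for x in xs]  # monotonic but nonlinear
print(spearman_rho(xs, ys), pearson_r(xs, ys))
```

For this monotonic but nonlinear toy relationship, ρ reaches its maximum while r stays below it, which is the behaviour the paragraph describes.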
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
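The pair-counting rule above corresponds to the variant usually called tau-a. A minimal sketch follows, with a hypothetical function name. Library implementations often use a tie-adjusted variant instead, so values may differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total pairs; tied pairs count as neither."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        product = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if product > 0:
            concordant += 1  # both variables order the pair the same way
        elif product < 0:
            discordant += 1  # the variables order the pair in opposite ways
    return (concordant - discordant) / len(pairs)


xs = [1, 2, 3, 4]
ys = [1, 3, 2, 4]
print(kendall_tau_a(xs, ys))
```

In this toy example, five of the six pairs are concordant and one is discordant, so τ = (5 - 1) / 6. The quadratic pair loop is fine for illustration but slow for large inputs.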
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available in the Phik project documentation.
First rows
| df_index | data | NO2Arpat | tair | rad | o3 | no2 | noAlpha | no2Alpha |
---|---|---|---|---|---|---|---|---|---|
0 | 2020-10-16 13:00:00 | 16/10/2020 13:00 | 32.0 | 16.128947 | 78.426316 | 829.973684 | 195.500000 | 91.157895 | 89.210526 |
1 | 2020-10-16 14:00:00 | 16/10/2020 14:00 | 26.0 | 17.792105 | 70.884211 | 847.657895 | 211.578947 | 91.052632 | 90.763158 |
2 | 2020-10-16 15:00:00 | 16/10/2020 15:00 | 34.0 | 20.611111 | 54.911111 | 870.944444 | 194.055556 | 89.861111 | 91.666667 |
3 | 2020-10-16 16:00:00 | 16/10/2020 16:00 | 33.0 | 20.522500 | 53.110000 | 897.400000 | 205.150000 | 91.000000 | 91.175000 |
4 | 2020-10-16 17:00:00 | 16/10/2020 17:00 | 54.0 | 18.324324 | 63.116216 | 870.513514 | 226.675676 | 92.243243 | 91.864865 |
5 | 2020-10-16 18:00:00 | 16/10/2020 18:00 | 58.0 | 15.188889 | 77.044444 | 859.194444 | 242.944444 | 92.416667 | 89.583333 |
6 | 2020-10-16 19:00:00 | 16/10/2020 19:00 | 56.0 | 13.643243 | 86.251351 | 869.459459 | 255.864865 | 93.135135 | 91.108108 |
7 | 2020-10-16 20:00:00 | 16/10/2020 20:00 | 41.0 | 12.323684 | 92.392105 | 874.552632 | 257.473684 | 91.657895 | 88.421053 |
8 | 2020-10-16 21:00:00 | 16/10/2020 21:00 | 38.0 | 11.344737 | 95.597368 | 869.842105 | 251.736842 | 91.210526 | 89.052632 |
9 | 2020-10-16 22:00:00 | 16/10/2020 22:00 | 39.0 | 10.902632 | 96.413158 | 866.000000 | 249.447368 | 91.052632 | 88.947368 |
Last rows
| df_index | data | NO2Arpat | tair | rad | o3 | no2 | noAlpha | no2Alpha |
---|---|---|---|---|---|---|---|---|---|
836 | 2020-11-30 14:00:00 | 30/11/2020 14:00 | 20.0 | 19.256410 | 30.951282 | 725.000000 | 106.820513 | 89.820513 | 89.384615 |
837 | 2020-11-30 15:00:00 | 30/11/2020 15:00 | 39.0 | 19.063158 | 33.521053 | 719.263158 | 126.421053 | 91.973684 | 88.342105 |
838 | 2020-11-30 16:00:00 | 30/11/2020 16:00 | 74.0 | 14.218919 | 46.145946 | 733.702703 | 154.189189 | 92.216216 | 88.702703 |
839 | 2020-11-30 17:00:00 | 30/11/2020 17:00 | 57.0 | 11.555263 | 56.257895 | 776.736842 | 185.605263 | 93.105263 | 88.394737 |
840 | 2020-11-30 18:00:00 | 30/11/2020 18:00 | 80.0 | 10.294444 | 63.538889 | 839.583333 | 204.555556 | 92.888889 | 89.500000 |
841 | 2020-11-30 19:00:00 | 30/11/2020 19:00 | 62.0 | 8.977500 | 71.537500 | 861.275000 | 211.175000 | 92.950000 | 91.050000 |
842 | 2020-11-30 20:00:00 | 30/11/2020 20:00 | 73.0 | 8.061111 | 77.361111 | 887.583333 | 215.388889 | 92.722222 | 93.555556 |
843 | 2020-11-30 21:00:00 | 30/11/2020 21:00 | 60.0 | 7.385000 | 72.305000 | 891.875000 | 211.275000 | 91.575000 | 88.850000 |
844 | 2020-11-30 22:00:00 | 30/11/2020 22:00 | 51.0 | 7.968421 | 66.963158 | 872.368421 | 160.605263 | 91.052632 | 88.131579 |
845 | 2020-11-30 23:00:00 | 30/11/2020 23:00 | 30.0 | 8.852941 | 63.150000 | 836.411765 | 154.323529 | 90.323529 | 87.411765 |