Probable outliers removal
Next, we remove sales that fall outside the probable price range; the range is derived from the standard deviation. On the downside, a sale is removed only if it is below both 75% of the moving average and the lower standard deviation band, that is, below whichever of the two values is lower.
We define the Truncated Mean as the average price used in these calculations. It is obtained by taking the 50 previous sales (excluding the current sale and outliers), ordering them by sale size, discarding the top three and bottom three sales, and then taking the mean of the 44 middle values.
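The truncated mean described above can be sketched in a few lines. This is a minimal illustration, not the production implementation: the function name `truncated_mean` and its parameters are hypothetical, the input is assumed to be a list of valid sale prices ordered oldest first (current sale and outliers already excluded), and the behaviour with too few sales is an assumption because the text does not specify it.

```python
from typing import Sequence


def truncated_mean(prices: Sequence[float], window: int = 50, trim: int = 3) -> float:
    """Mean of the last `window` prices after dropping the `trim` lowest and highest.

    Assumes `prices` holds only valid sales (no outliers, no current sale),
    ordered oldest first.
    """
    # Take the most recent `window` sales and order them by size.
    recent = sorted(prices[-window:])
    # Assumption: refuse to compute when trimming would leave nothing.
    if len(recent) <= 2 * trim:
        raise ValueError("not enough sales to compute a truncated mean")
    # Drop the bottom `trim` and top `trim` values; 50 sales leave 44.
    middle = recent[trim:len(recent) - trim]
    return sum(middle) / len(middle)
```

With 50 sales and the default parameters, the three lowest and three highest prices are ignored, so a handful of extreme sales in the window does not move the mean.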
The minimum lower bound is used to prevent cases where the lower standard deviation band gets too close to the truncated mean and no longer reflects downward market movement.
Using 75% of the mean for the lower band has been backtested on the most liquid collections and has proven sufficient to react to downward price swings.
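As a rough sketch of how the two lower limits combine, the effective lower bound is the smaller of the standard deviation band and 75% of the truncated mean. The function name `lower_bound` and the parameter `x` (the multiplier called X in the range formula) are hypothetical, and the values used in the usage note are illustrative because the text does not state what X is.

```python
def lower_bound(mean: float, std_dev: float, x: float, floor_ratio: float = 0.75) -> float:
    """Effective lower bound: whichever of the two candidate limits is lower."""
    # Lower standard deviation band around the truncated mean.
    std_band = mean - x * std_dev
    # Fixed floor at 75% of the truncated mean.
    ratio_floor = floor_ratio * mean
    return min(std_band, ratio_floor)
```

For example, with a mean of 100, a standard deviation of 5 and an illustrative multiplier of 2, the band alone would sit at 90, so the 75% floor (75) takes over. With a standard deviation of 20 the band (60) is already the lower of the two and is used as is.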
The Sample Standard Deviation is used to find the price ranges in which sales are most likely to occur.
These ranges are later used to validate whether a new sale is within the possible range. Truncated Mean values are used in the formula:

$$s = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2}$$

| Symbol | Meaning |
| --- | --- |
| $s$ | sample standard deviation |
| $N$ | the number of observations (valid sales) |
| $x_i$ | the observed values of a sample item (sale amount) |
| $\bar{x}$ | the truncated mean value of the observations (valid sales) |
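The sample standard deviation measured around the truncated mean can be sketched as follows. The function name `sample_std_around` is hypothetical, and the sketch simply uses whatever list of valid sale amounts it is given; which sales make up that list (all 50 or only the 44 middle values) is not stated in the text and is left to the caller.

```python
import math
from typing import Sequence


def sample_std_around(values: Sequence[float], center: float) -> float:
    """Sample standard deviation of `values` measured around `center`.

    `center` is expected to be the truncated mean rather than the plain
    arithmetic mean of `values`.
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are required")
    # Sum of squared deviations from the supplied center.
    squared = sum((v - center) ** 2 for v in values)
    # Divide by n - 1 (sample, not population, standard deviation).
    return math.sqrt(squared / (n - 1))
```

Python's standard library offers the same calculation through `statistics.stdev(data, xbar)`, where `xbar` is normally the mean of `data`; the explicit version above makes it visible that the center is supplied separately.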
Next, we set the range within which a sale is not an outlier.
Mean - X * s ≤ NewSalePrice ≤ Mean + X * s
where Mean = truncated mean, s = sample standard deviation, and X = quantity (the multiplier applied to the standard deviation).
NewSalePrice is then checked against this range. If it falls outside, the sale is marked as an outlier and is excluded from further calculations.
If the sale is valid, it is then checked to determine whether it is a floor value.
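The range check and the exclusion of outliers from later calculations might look like the sketch below. The names `within_range` and `record_sale` are hypothetical, the bounds are assumed to have been computed beforehand, and both bounds are treated as inclusive to match the ≤ signs in the range formula.

```python
from typing import List


def within_range(price: float, lower: float, upper: float) -> bool:
    """True when the price lies inside the accepted range (bounds inclusive)."""
    return lower <= price <= upper


def record_sale(valid_sales: List[float], price: float, lower: float, upper: float) -> bool:
    """Append `price` to `valid_sales` only when it is not an outlier.

    Returns True if the sale was accepted, False if it was rejected.
    Rejected sales never enter `valid_sales`, so they cannot influence
    later truncated mean or standard deviation calculations.
    """
    if within_range(price, lower, upper):
        valid_sales.append(price)
        return True
    return False
```

A sale accepted here would then move on to the floor value check; a rejected sale is simply dropped from the history used in the next round of calculations.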