# Probable outliers removal

Next, we remove sales that fall outside the probable price range. The range is built using the standard deviation. Additionally, a sale is removed on the low side only if it is below both 75% of the moving average and the lower standard deviation band (that is, below whichever of the two values is lower).

## Calculating the Truncated Mean (Average Price)

The Truncated Mean is the average price used in the calculations that follow. It is obtained by taking the 50 previous sales (excluding the current sale and any outliers), ordering them by sale size, discarding the top three and bottom three sales, and taking the mean of the 44 middle values.
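The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the production implementation: the function name `truncated_mean` and its parameters are hypothetical, the input is assumed to be a chronological list of valid sale prices that already excludes the current sale and earlier outliers, and the behaviour with fewer than 50 sales is an assumption because the source does not describe it.

```python
from statistics import fmean


def truncated_mean(prices, window=50, trim=3):
    """Mean of the most recent `window` prices after dropping the `trim`
    highest and `trim` lowest values.

    `prices` is assumed to be in chronological order and to contain only
    valid sales (no current sale, no previously flagged outliers).
    """
    recent = prices[-window:]  # the last `window` valid sales
    if len(recent) <= 2 * trim:
        # Assumption: the source does not say what happens with few sales.
        raise ValueError("not enough sales to trim")
    ordered = sorted(recent)  # order by sale size
    middle = ordered[trim:len(ordered) - trim]  # drop top and bottom `trim`
    return fmean(middle)


# With 50 sales priced 1..50, the 44 middle values are 4..47.
print(truncated_mean(list(range(1, 51))))
```

With the default settings and a full window of 50 sales, exactly 44 values contribute to the mean, matching the description above.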

## Minimum Lower Bound Range

The minimum lower bound guards against cases where the lower standard deviation band gets too close to the truncated mean and no longer leaves room for downward market movement.

Using 75% of the mean for the lower band has been backtested on the most liquid collections and has proven sufficient to react to downward price swings.
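To make the effect of the minimum lower bound concrete, the sketch below picks the lower of the two candidate bounds. The function name `lower_bound` and the multiplier `x` are hypothetical; the source does not state the multiplier's value, so the numbers are purely illustrative.

```python
def lower_bound(mean, std_dev, x, floor_ratio=0.75):
    """Return the effective lower bound for a valid sale.

    `mean` is the truncated mean, `std_dev` the sample standard deviation,
    and `x` an assumed multiplier on the standard deviation.
    """
    band = mean - x * std_dev    # lower standard deviation band
    floor = floor_ratio * mean   # 75% of the mean
    return min(band, floor)      # whichever value is lower


# Quiet market: the band (90) hugs the mean, so the 75% floor (75) applies.
print(lower_bound(mean=100, std_dev=5, x=2))
# Volatile market: the band (60) is already below the 75% floor.
print(lower_bound(mean=100, std_dev=20, x=2))
```

In the first call a sale at 80 would still be accepted even though it sits below the standard deviation band, which is the downward movement the minimum lower bound is meant to allow.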

## Setting Ranges with the Standard Deviation

The Sample Standard Deviation is used to find the price ranges in which sales are most likely to occur.

These ranges are later used to validate whether a new sale falls within the probable range. The Truncated Mean takes the place of the mean in the standard deviation formula.
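As a rough sketch, a sample standard deviation measured around a supplied truncated mean could be computed as follows. The function name `sample_std` is hypothetical, and which set of valid sales is passed in (for example the full 50-sale window or only the trimmed values) is left to the caller because the source does not specify it.

```python
from math import sqrt


def sample_std(values, mean):
    """Sample standard deviation of `values` around a supplied `mean`.

    The supplied mean is expected to be the truncated mean rather than
    the plain arithmetic mean of `values`.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    squared_deviations = sum((v - mean) ** 2 for v in values)
    return sqrt(squared_deviations / (n - 1))  # divide by N - 1 (sample)


print(sample_std([2, 4, 6], mean=4))
```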

## Check if the Sale is Within the Range

Next, we set the range within which a sale is not considered an outlier.

NewSalePrice is then checked against this range. If it falls outside, the sale is marked as an outlier and is not included in further calculations.

If the sale is valid, it is then checked to determine whether it is a floor value.

$$
\text{Mean} - X \cdot s \le \text{NewSalePrice} \le \text{Mean} + X \cdot s
$$

where

Mean = truncated mean

$s$ = sample standard deviation

$X$ = quantity (the multiplier applied to the standard deviation)
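Putting the range check together with the 75% minimum lower bound described earlier, an outlier test might look like the sketch below. The function name `is_outlier` and the multiplier `x` are hypothetical, and the value of `x` is not given in the source.

```python
def is_outlier(new_sale_price, mean, std_dev, x, floor_ratio=0.75):
    """Return True if the sale falls outside the accepted range.

    Upper bound: mean + x * std_dev.
    Lower bound: the lower of (mean - x * std_dev) and 75% of the mean.
    Both bounds are inclusive, matching the <= signs in the formula.
    """
    upper = mean + x * std_dev
    lower = min(mean - x * std_dev, floor_ratio * mean)
    return not (lower <= new_sale_price <= upper)


# Truncated mean 100, standard deviation 5, assumed multiplier 2:
# the accepted range is 75..110.
for price in (70, 80, 110, 111):
    print(price, is_outlier(price, mean=100, std_dev=5, x=2))
```

Sales flagged by this check would be excluded from the window used for later truncated mean and standard deviation calculations.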

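$$
s = \sqrt{\frac{1}{N - 1} \sum_{i=1}^{N} \left( x_i - \bar{x} \right)^2}
$$

where

$s$ = sample standard deviation

$N$ = the number of observations (valid sales)

$x_i$ = the observed values of a sample item (sale amount)

$\bar{x}$ = the truncated mean value of the observations (valid sales)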