The inverse of the quantile function is the CDF (cumulative distribution function). I had a look at this quantile transformer in Python, and it is about something else: it transforms a set of data so that it follows a uniform or a normal distribution.
Below is the comment from the source code of that function.
This method transforms the features to follow a uniform or a normal distribution. Therefore, for a given feature, this transformation tends to spread out the most frequent values. It also reduces the impact of (marginal) outliers: this is therefore a robust preprocessing scheme.
The transformation is applied on each feature independently. First an estimate of the cumulative distribution function of a feature is used to map the original values to a uniform distribution. The obtained values are then mapped to the desired output distribution using the associated quantile function. Features values of new/unseen data that fall below or above the fitted range will be mapped to the bounds of the output distribution. Note that this transform is non-linear. It may distort linear correlations between variables measured at the same scale but renders variables measured at different scales more directly comparable.
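The two steps described in that comment can be sketched in plain Python. This is only a minimal illustration, not the library's implementation: the helper names (fit_ecdf, to_uniform, to_normal), the piecewise-linear CDF estimate, and the eps clipping value are all assumptions made for the example.

```python
from bisect import bisect_right
from statistics import NormalDist


def fit_ecdf(sample):
    """Return the sorted sample and evenly spaced reference levels in [0, 1].

    Hypothetical helper: assumes at least two values in the sample.
    """
    xs = sorted(sample)
    n = len(xs)
    levels = [i / (n - 1) for i in range(n)]
    return xs, levels


def to_uniform(x, xs, levels):
    """Step 1: map x through a piecewise-linear estimate of the CDF.

    Values outside the fitted range are mapped to the bounds 0 and 1.
    """
    if x <= xs[0]:
        return 0.0
    if x >= xs[-1]:
        return 1.0
    j = bisect_right(xs, x)  # xs[j - 1] <= x < xs[j]
    t = (x - xs[j - 1]) / (xs[j] - xs[j - 1])
    return levels[j - 1] + t * (levels[j] - levels[j - 1])


def to_normal(u, eps=1e-7):
    """Step 2: apply the standard normal quantile function to u.

    u is clipped away from 0 and 1 (eps is an assumed value) so the
    result stays finite.
    """
    u = min(max(u, eps), 1.0 - eps)
    return NormalDist().inv_cdf(u)


xs, levels = fit_ecdf([10, 20, 30, 40])
uniform_values = [to_uniform(v, xs, levels) for v in (5, 10, 25, 40, 100)]
normal_values = [to_normal(u) for u in uniform_values]
print(uniform_values)
print(normal_values)
```

In this sketch 5 and 100 lie outside the fitted range, so they land on the same bounds as 10 and 40. That clipping is the one place where information is lost.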
Given a = [10, 20, 30, 40] and b = qtransform(a), would inverse_qtransform(b) equal a?
To be accurate: Python itself has no quantile transform. I don't know whether an extension package provides one.
There is one. See:
sklearn.preprocessing.QuantileTransformer
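Do you know how to invert it? It doesn't matter even if Python has no quantile transform. The quantile transform exists as a concept, and it is basic statistics. My question is simply: does an inverse transform exist?
Not every function has an inverse, and this one in particular does not.
Thanks!
Since it has no inverse, why is it used in today's hottest large models (transformers)?
You can transform, but you can't transform back. Doesn't that make it a bad model?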
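For what it's worth, sklearn.preprocessing.QuantileTransformer does provide an inverse_transform method, so the round trip asked about earlier can be checked directly. The snippet below is a small sketch, assuming scikit-learn and NumPy are installed; n_quantiles=4 is chosen only to match the four sample values. Inside the fitted range the round trip should recover the input, while values outside the range are mapped to the bounds and cannot be recovered.

```python
import numpy as np
from sklearn.preprocessing import QuantileTransformer

# One feature with four samples; sklearn expects a 2-D array.
a = np.array([10.0, 20.0, 30.0, 40.0]).reshape(-1, 1)

# n_quantiles must not exceed the number of samples.
qt = QuantileTransformer(n_quantiles=4, output_distribution="uniform")
b = qt.fit_transform(a)
a_back = qt.inverse_transform(b)
round_trip_ok = np.allclose(a_back, a)
print(round_trip_ok)

# Values outside the fitted range are clipped to the output bounds,
# so inverting them returns the edge of the fitted range instead.
outside = np.array([[5.0], [100.0]])
clipped = qt.transform(outside)
clipped_back = qt.inverse_transform(clipped)
print(clipped.ravel(), clipped_back.ravel())
```

So the transform is invertible wherever the estimated CDF is strictly increasing. The name is also unrelated to the Transformer architecture used in large language models: QuantileTransformer is a preprocessing step for features.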