Opened 22 months ago

Last modified 22 months ago

#34325 closed Cleanup/optimization

PercentRank confusion — at Initial Version

Reported by: dennisvang
Owned by: nobody
Component: Documentation
Version: 4.1
Severity: Normal
Keywords:
Cc:
Triage Stage: Ready for checkin
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

The documentation for the PercentRank window function says:

Computes the percentile rank of the rows in the frame clause. This computation is equivalent to evaluating:

(rank - 1) / (total rows - 1)

(my emphasis on "percentile rank")

However, I'm not so sure "percentile rank" is the correct term.

If you look up the (statistical) term "percentile rank" online, you'll find various definitions, ranging from

(CF - 0.5 * F) / N

where CF—the cumulative frequency—is the count of all scores less than or equal to the score of interest, F is the frequency for the score of interest, and N is the number of scores in the distribution.

to something like

<number of values less than the score of interest> / <total number of values in the data set>

However, none exactly matches the definition in the Django docs.
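
To make the difference concrete, here is a minimal Python sketch that evaluates the three formulas above on a small made-up list of scores. The function names and the sample data are illustrative assumptions, and "rank" is taken to be the SQL-style RANK(), i.e. one plus the number of strictly smaller values.

```python
from fractions import Fraction


def sql_percent_rank(scores, x):
    """(rank - 1) / (total rows - 1), with rank as in SQL RANK().

    Assumes more than one row; a single row would divide by zero here.
    """
    rank = 1 + sum(1 for s in scores if s < x)
    return Fraction(rank - 1, len(scores) - 1)


def percentile_rank_midpoint(scores, x):
    """(CF - 0.5 * F) / N, the first statistical definition quoted above."""
    cf = sum(1 for s in scores if s <= x)  # cumulative frequency
    f = scores.count(x)                    # frequency of the score itself
    return (Fraction(cf) - Fraction(f, 2)) / len(scores)


def percentile_rank_below(scores, x):
    """<number of values less than x> / <total number of values>."""
    below = sum(1 for s in scores if s < x)
    return Fraction(below, len(scores))


# Hypothetical sample data with one tie.
scores = [10, 20, 20, 30, 40]

for x in sorted(set(scores)):
    print(
        x,
        sql_percent_rank(scores, x),
        percentile_rank_midpoint(scores, x),
        percentile_rank_below(scores, x),
    )
```

For the score 20 in this sample, the three functions give 1/4, 2/5 and 1/5 respectively, so the formula in the Django docs agrees with neither statistical definition.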

Note also that the SQLite and PostgreSQL documentation for the percent_rank function does not mention "percentile rank". Instead, it uses the term "relative rank".

To prevent confusion, wouldn't it be better to use the same terminology as the database backends?
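
For reference, a quick way to check what a backend actually computes is to run percent_rank() directly. The sketch below uses Python's built-in sqlite3 module with a hypothetical table and data, and assumes the bundled SQLite is version 3.25 or newer (the first release with window functions).

```python
import sqlite3

# In-memory database with a hypothetical "scores" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (value INTEGER)")
conn.executemany(
    "INSERT INTO scores (value) VALUES (?)",
    [(10,), (20,), (20,), (30,), (40,)],
)

# Compare the built-in percent_rank() with the formula from the docs:
# (rank - 1) / (total rows - 1).
rows = conn.execute(
    """
    SELECT value,
           percent_rank() OVER (ORDER BY value) AS pr,
           (rank() OVER (ORDER BY value) - 1.0)
               / (count(*) OVER () - 1) AS manual
    FROM scores
    ORDER BY value
    """
).fetchall()

for value, pr, manual in rows:
    print(value, pr, manual)

conn.close()
```

On this sample, the two computed columns should agree for every row, which is consistent with the formula quoted in the Django docs rather than with either statistical definition of "percentile rank".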

Change History (0)
