Analysis: Alleged Bias Found in Extrapolation Audits Part IV

By Frank Cohen, MPA
June 27, 2018

The PIM is a woefully inadequate guide for audits leveraging extrapolation.

EDITOR’S NOTE: This is the fourth in a series of reports on alleged bias the author has uncovered in extrapolation audits.

This is part four of a three-part series expressing my disappointment in chapter 8 of the Medicare Program Integrity Manual (PIM).

Did you get that? Part IV of a three-part series. That’s right.

After I finished up the trilogy, I realized that I missed one very important aspect of inferential statistics and extrapolation, and that was the issue of outliers. Actually, I could easily be forgiven for forgetting this very important topic, because, and this is really strange, nowhere within the entirety of chapter 8 of the PIM is the word “outlier” found. The reason this is so strange is that the issue of outliers is huge when it comes to inferential statistical methods – and in particular, extrapolation.

First, a bit of background. In addition to not mentioning outliers, chapter 8 of the PIM also fails to mention the “median” as a critical metric when considering inferential statistical calculations – and as I will explain later, the use of the median over the mean is quite common in the presence of outliers. I know this because in a recent federal trial in which I was the statistical expert, the prosecutor asked me to find anywhere in chapter 8 where the word “median” was used, and, knowing this was just a setup question, I responded honestly that it was not. Her conclusion, then, was that if the PIM didn’t mention it, it couldn’t be a viable metric for extrapolation. That’s simply false, and I would venture to say that I would get resounding agreement from not just statisticians, but anyone who studied high-school mathematics.

Outliers are nearly always a part of a billing and coding audit for reasons that were discussed in prior articles. Specifically, billing audit data are almost always heavily skewed to the right because the paid amount is bounded on the left by zero. This is because the least a provider can be paid for any procedure is, well, nothing. And while there is a practical limit to the maximum a provider can be paid, it can be very high, and as such, this skews the data to the right. With this type of skewed data set, it is almost always more appropriate and accurate to use the median over the mean, and this can be particularly true when outliers are present in the data.

According to the National Institute of Standards and Technology, a part of the U.S. Department of Commerce, “an outlier is an observation that lies an abnormal distance from other values in a random sample from a population. In a sense, this definition leaves it up to the analyst (or a consensus process) to decide what will be considered abnormal.” Notice the part that says that the determination of whether a data point (or points) constitutes an outlier is up to the analyst to decide. It also mentions a consensus process, which means using established calculations based on standards of statistical practice.
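To make the idea of an “established calculation” concrete, here is a minimal sketch of one widely used convention, the 1.5 × IQR fence rule associated with box plots. It is only one of several accepted approaches, it is not something chapter 8 prescribes, and the paid amounts below are hypothetical values invented for illustration.

```python
from statistics import quantiles

# Hypothetical claim paid amounts (illustrative only, not from any audit).
paid = [0.00, 45.00, 52.00, 60.00, 61.00, 75.00,
        80.00, 88.00, 95.00, 110.00, 120.00, 6000.00]

# Quartiles via the standard library's default ("exclusive") method.
q1, _, q3 = quantiles(paid, n=4)
iqr = q3 - q1

# Fences: points beyond 1.5 * IQR from the quartiles are flagged.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

low_outliers = [x for x in paid if x < lower_fence]
high_outliers = [x for x in paid if x > upper_fence]

print(f"fences: [{lower_fence:.2f}, {upper_fence:.2f}]")
print("low outliers: ", low_outliers)
print("high outliers:", high_outliers)
```

In this made-up data the lower fence falls below zero, so even a $0.00 payment is not flagged, while the single very large payment is. That asymmetry is typical of data bounded on the left by zero.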

Now, I could write a whole book on outliers, but there are some basic tenets that can be easily understood and observed in order for an extrapolation to accurately represent some estimate of overpayment in a billing audit.

To begin with, outliers are common because of the way in which most auditors identify the audit unit. The most common unit is the claim, and as has been discussed in prior articles in this series, the claim is not usually the best unit. The reason is that a single claim can contain many claim lines, each of which represents a different procedure code, code type, or category, as well as highly variable paid amounts. In fact, all one has to do is look at the paid claims data within their own organization to see what a mess this can be with regard to statistical extrapolation.

For example, I can have five claims that all represent the same diagnosis with different procedure codes in different code categories, such as E&M, imaging, surgical, pathology, etc. And within those five claims, I could show you the same code present, with each service paid as some variable amount and even not paid at all. Also, when you mix claims from different providers in different specialties for different levels of patient acuity, you are going to often see huge swings in payments. Again, for example, I can show you data for a recent audit wherein the claim paid amount went from $0.88 to $6,243. That’s a huge range, and it contained lots of outliers, both high and low.

What is the effect of outliers on extrapolation? Some of this depends on what metrics are used for measuring central tendency. If the average (or mean) is used, then outliers will have a huge impact because the mean is very sensitive to these outliers. And while any data set can have both low and high outliers, the majority are made up of high outliers, based on the nature of the billing process. Not too long ago, I testified in federal court as a statistical expert, defending a physician against a couple of extrapolations that were performed by a Zone Program Integrity Contractor (ZPIC). I was trying to explain to the judge why outliers were so destructive when using the average rather than the median, and he asked me to give him an example, so here is what I told him:

“A statistician walks into a bar with 100 patrons. With his pencil and paper, he goes around and asks each patron what their income was over the past year. After getting the responses, the statistician calculates an average income of $48,000, and because the data was normally distributed, a median income of $48,000. So far, so good. Then, Bill Gates walks into the bar and orders a beer. Now, on average, everyone in the bar is a multi-millionaire, but the median remains the same: $48,000.”
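The bar example can be reproduced in a few lines. This is only a sketch: the 100 incomes are invented so that they sit symmetrically around $48,000, and the newcomer’s $1 billion income is an assumed figure chosen purely for illustration.

```python
from statistics import mean, median

# 100 hypothetical patron incomes, spread symmetrically around $48,000
# ($38,100 to $57,900 in $200 steps). Invented for illustration.
incomes = [38_100 + 200 * i for i in range(100)]

mean_before = mean(incomes)
median_before = median(incomes)

# One extreme earner walks in; the $1 billion figure is an assumption.
incomes_with_outlier = incomes + [1_000_000_000]

mean_after = mean(incomes_with_outlier)
median_after = median(incomes_with_outlier)

print(f"before: mean={mean_before:,.0f}  median={median_before:,.0f}")
print(f"after:  mean={mean_after:,.0f}  median={median_after:,.0f}")
```

With these assumed numbers the mean jumps from $48,000 to roughly $9.9 million, while the median moves by a single step, from $48,000 to $48,100. One observation out of 101 is enough to make the average meaningless as a description of the typical patron.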

The point is this: since we are pretty much stuck with the use of the average, we need to aggressively address the issue of outliers. Remember, the median is not mentioned anywhere in chapter 8, so auditors don’t feel as though they need to consider it, even though it is accepted as a standard of statistical practice (and in most cases, the more appropriate and accurate metric).

There are three ways to address the issue of outliers. The first is through the use of stratification, which is the process of separating a population into more similar subsets, and while I will argue against the use of the paid amount as the basis for stratification, that is the variable of interest used by the auditors. Using the paid amount, there are statistical calculations and processes that can be employed to effectively and properly stratify a population.

Unfortunately, this requires a bit of effort, and I have never seen a single audit for which the stratification was based on any amount of logic or reason. Mostly, the auditors just guess at some break point and leave it at that. Remember, chapter 8 does not require that the auditors explain themselves when it comes to their logic (or lack of logic) in making any statistical decisions, so we are often left to second-guessing, which is never a good strategy.

The second method would be to change the variable of interest, the unit, or both. For example, instead of using the paid amount, I could stratify by code type or category. I covered this extensively in Part III of this series. Regarding units, if we went to a more granular level, such as the specific procedure code, while we would still see some variability in payment, it wouldn’t be nearly as much, since we would not be trying to balance a composite payment based on some variable number of units within the claim. In some cases, such as the federal trial referenced above, the auditor would have been better served using the beneficiary rather than the claim as the unit. It would have normalized the payment amount and smoothed the payment variability.

The third method is to exclude the outliers from the extrapolation, which is the preferred method, since it is never appropriate to include an outlier in an extrapolation. The way to do this is by using something called a certainty stratum, and this is mentioned in section 8.4.11.1 in Chapter 8, as follows: “If it is believed that the amount of overpayment is correlated with the amount of the original payment and the universe distribution of paid amounts is skewed to the right, i.e., with a set of extremely high values, it may be advantageous to define a ‘certainty stratum,’ selecting all of the sampling units starting with the largest value and working backward to the left of the distribution.”

Reread the two parts of this passage. The first is conditional on the overpayment being correlated with the original payment amount, which in my experience is rarely the case. Regardless, since the guidelines do nothing to actually address that issue, this is something that auditors almost always simply accept as fact without subjecting the data to even the most basic statistical tests. The second part mentions the need to control for “extremely high values,” yet once again, it does nothing to define this.

Are they talking about outliers? If so, what statistical method or methods are recommended? Well, the answer is none, because the PIM guidelines are there as a shield to protect the auditor from reasonable objections and not as a true source of statistical validity or fairness. It even says that you should start with the highest values and work backwards. But backwards to where? Well, they might say that this is left to the statistician to determine, but whose statistician? Obviously, they can do whatever they want and defend it using the nebulousness of the wording here, but if I should contest their work, for example stating that they didn’t follow any standard or logic, their defense again would be that they are not required to explain themselves.

This section further states: “When a stratum is sampled with certainty, i.e., auditing all of the sample units contained therein, the contribution of that stratum to the overall sampling error is zero.” This means that those data points that are placed into the certainty stratum are, in effect, being separated from the rest of the sample and therefore should not be included in the extrapolation – and this is very true. Here is the next sentence: “In that manner, extremely large overpayments in the sample are prevented from causing poor precision in estimation.” This, again, is quite true, but one of the defenses that I have seen put forth by a ZPIC for not including a certainty stratum is that they are not held to any standard regarding precision. That is true. Even though precision is an important part of the U.S. Department of Health and Human Services (HHS) Office of Inspector General (OIG) audit process and specific guidelines are given by the Office of Management and Budget (OMB) and in the Federal Register, chapter 8 says nothing about precision – and as such, in general, the auditors don’t exhibit any obligation to be bound by any standards.
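A rough sketch of what separating a certainty stratum does to a point estimate follows. The function name, the overpayment figures, and the universe size are all hypothetical, and the simple mean-per-unit projection is an assumption made for illustration, not the method any particular auditor uses.

```python
from statistics import mean

def estimate_with_certainty_stratum(sampled_overpayments,
                                    n_remaining_universe,
                                    certainty_overpayments):
    """Project the sampled remainder; add the fully audited stratum as-is."""
    projected = mean(sampled_overpayments) * n_remaining_universe
    # Certainty-stratum units were all audited, so nothing is projected
    # from them and they add no sampling error.
    return projected + sum(certainty_overpayments)

# Hypothetical figures, for illustration only.
sampled = [0, 10, 20, 0, 30]   # overpayments found in the ordinary sample
certainty = [500, 1500]        # overpayments on the two largest claims
n_remaining = 1000             # universe size excluding those two claims

with_stratum = estimate_with_certainty_stratum(sampled, n_remaining, certainty)

# Naive alternative: leave the two large claims in the sample and project
# their overpayments across the entire universe.
naive = mean(sampled + certainty) * (n_remaining + len(certainty))

print(f"with certainty stratum: {with_stratum:,.0f}")
print(f"naive extrapolation:    {naive:,.0f}")
```

With these invented numbers the certainty-stratum estimate is $14,000, while the naive projection is more than twenty times larger, because two extreme values are treated as if they were typical of the whole universe.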

Finally, we read this from the same section: “In practice, the decision of whether or not to sample the right tail with certainty depends on fairly accurate prior knowledge of the distribution of overpayments, and also on the ability to totally audit one stratum while having sufficient resources left over to sample from each of the remaining strata.” This means that the decision to use a certainty stratum is based on whether the auditor has the ability to conduct a distribution and outlier analysis – yet, out of the hundreds of extrapolation audits in which I have been engaged as a statistical expert, I have rarely seen any auditor produce these types of analyses.

But it goes on to say that irrespective of the need or the importance or the contribution to precision and accuracy, the auditor does not have to employ this method if it involves too much work on their part. Is that even possible? That’s like saying that you have a right to a jury trial, unless the process of picking a jury will take too much time and effort, leaving not enough time for a fair trial. What a bunch of hooey.

In the interest of brevity, I am not going to repeat all of my conclusions from the prior three articles. In fact, I don’t think I have to, because I am confident that any reasonable person, statistician or not, understands how inadequate and outdated chapter 8 has become when using it as a tool to inappropriately recoup billions of dollars from otherwise hard-working and honest healthcare providers. I believe we are reaching a boiling point and that there is going to be a revolt against CMS regarding the use of these guidelines. You can deny people their rights for only so long before you break the camel’s back, and this, in my opinion, is that straw.

And that’s the world according to Frank.

Program Note: Register to attend the Frank Cohen webcast today on this subject at 1:30 pm ET.

TAGS: Auditing, Reimbursement

Frank Cohen, MPA

Frank Cohen is Senior Director of Analytics and Business Intelligence for VMG Health, LLC. He is a computational statistician with a focus on building risk-based audit models using predictive analytics and machine learning algorithms. He has participated in numerous studies and authored several books, including his latest, titled “Don’t Do Something, Just Stand There: A Primer for Evidence-based Practice.”
