Digital Tracking and Collection: See, Store, Share Wisely

Modea

Written on January 30, 2023

Healthcare

Digital tracking, a longstanding practice, involves companies using codes and scripts to monitor user behavior on websites or mobile apps. The primary aim is to inform marketing decisions and personalize the user experience, which benefits both parties when implemented correctly. However, poor execution, or the sale of collected data to third parties without consent, raises serious privacy concerns.

In healthcare, providers use tracking on consumer-facing products to enhance the digital care journey. However, healthcare tracking is more intricate than in e-commerce. The data collected often involves sensitive information or Protected Health Information (PHI), adding complexity to when and how tracking can be employed, irrespective of intent.

The ever-changing landscape of consumer preferences, digital device usage, and privacy expectations prompts ongoing revisions in rules governing data tracking in healthcare. HIPAA, established in 1996 and supplemented by Privacy and Security Rules in 2003, addresses these concerns. With the surge in technology use and escalating online data sharing, there’s an amplified focus on the collection and use of Protected Health Information (PHI), further intensified by recent data breaches.

A bulletin from the Department of Health and Human Services details guidelines for HIPAA-covered entities and business associates on the use of online tracking technologies. The document provides insights into how healthcare organizations, termed ‘regulated entities’ under HIPAA, should navigate digital tracking and measurement.

What do the new guidelines mean for your organization?

The HHS decisively spells out that a ‘regulated entity’ cannot improperly share PHI with a third party:

Regulated entities (note: healthcare organizations = ‘regulated entities’) disclose a variety of information to tracking technology vendors through tracking technologies placed on a regulated entity’s website or mobile app, including individually identifiable health information (IIHI) that the individual provides when they use regulated entities’ websites or mobile apps. This information might include an individual’s medical record number, home or email address, or dates of appointments, as well as an individual’s IP address or geographic location, medical device IDs, or any unique identifying code.

All such IIHI collected on a regulated entity’s website or mobile app generally is PHI, even if the individual does not have an existing relationship with the regulated entity and even if the IIHI, such as IP address or geographic location, does not include specific treatment or billing information like dates and types of health care services. This is because, when a regulated entity collects the individual’s IIHI through its website or mobile app, the information connects the individual to the regulated entity (i.e., it is indicative that the individual has received or will receive health care services or benefits from the covered entity), and thus relates to the individual’s past, present, or future health or health care or payment for care.

Put plainly, the above excerpt underscores the importance of ensuring that your healthcare organization, as a ‘regulated entity,’ shares information with third-party vendors or systems in a responsible manner. In essence, refrain from transmitting identifiable PHI to third-party vendors.

Okay, but what about IIHI?

In addition to providing guidelines on PHI collection, the HHS also specifies a broad range of IIHI (Individually Identifiable Health Information) that should be avoided when collecting data on your digital platforms. For instance, someone clicking on a provider profile on your organization’s website, on its own, doesn’t pose a problem and is acceptable to collect.

However, there are two scenarios where information becomes problematic, or “individually identifiable,” and should not be collected or transmitted to a third party: first, if an individual can be reasonably identified (through the collection of commonly identifiable information like names, phone numbers, or IP addresses), and second, if that information is shared with third parties.

Want to avoid problems? Avoid collecting IIHI as much as possible.

Three kinds of tracked data most often make information individually identifiable:

Precise geolocation
IP address or other unique identifiers (think advertising)
Personal information entered in text input fields (think a form or login)

What is the best way to ensure we’re not passing identifiable data to third-party vendors?

The bulletin, helpfully, distinguishes three broad categories of digital properties:

User-authenticated web pages (which require a user to log in, such as MyChart)
Mobile applications (delivered by and on behalf of a healthcare organization)
Unauthenticated web pages (which do not require a login, like your standard consumer-facing website)

Let’s break each of these down a bit more with the important information you should know.

User-authenticated web pages:

A user-authenticated web page, like MyChart in healthcare, requires users to log in with identifiable information for access. The simplest solution is to avoid placing third-party tracking, such as pixels or session-recording software, on your MyChart or patient portal instance.

If you require meaningful user behavior tracking from a MyChart instance, explore secure options like an on-premises server running Matomo or a custom tracking implementation (though these can be costly). If you opt for the free Google Analytics tools, consult an implementation expert to ensure a secure setup.

Mobile applications:

Mobile applications, especially those listed with Epic or Cerner integrations, have a higher likelihood of exposing PHI. For example, if you use biometrics to let a user access the application, that biometric data is protected health information.

It is important to ensure that data cannot be identified BEFORE it is collected. Cleaning it afterward is not good enough: once a third-party vendor has received the data, filtering a field in its database does not undo the disclosure. De-identification must happen before the point of collection.
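One way to picture de-identification before collection is an allowlist applied to each event before it leaves your systems. The sketch below is only an illustration under stated assumptions: the key names, the `ALLOWED_KEYS` set, and the `deidentify` helper are all hypothetical, and a real implementation would depend on your tracking setup.

```python
from typing import Any, Dict

# Hypothetical allowlist: only these keys may be sent to a third party.
ALLOWED_KEYS = {"event_name", "page_path", "device_category"}


def deidentify(event: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the event containing only allowlisted keys.

    Anything not explicitly allowed (IP address, email, coordinates,
    free-text input) is dropped before the event is transmitted.
    """
    return {key: value for key, value in event.items() if key in ALLOWED_KEYS}


# Made-up sample event for illustration only.
raw_event = {
    "event_name": "provider_profile_view",
    "page_path": "/providers/cardiology",
    "ip_address": "203.0.113.42",
    "email": "patient@example.com",
    "latitude": 38.0293,
}
safe_event = deidentify(raw_event)
print(safe_event)
```

An allowlist is used here rather than a blocklist so that a newly added field is withheld by default until someone decides it is safe to share.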

How to manage this:

For mobile applications, GA4 masks IP addresses by default.
Most mobile applications for healthcare do not log the precise location of a user, but those with built-in wayfinding may be at risk, especially if that wayfinding data is passed to or stored with any third parties.
Advertising should also be disabled for mobile applications.
The guidance mentions device IDs. Our understanding is that a “device ID” is the identifier the manufacturer assigns to an individual phone, laptop, or tablet, which is not the value Google Analytics reports. Google Analytics reports a unique app installation ID, and one device or individual could have multiple app installation IDs.

If these conditions are met, you can prevent most PHI from being collected as well as prevent identifiable PHI from being passed to third parties.

Unauthenticated web pages:

Unauthenticated web pages are ones that don’t require the user to log in. In other words, your standard-issue, publicly available web page. The majority of health system digital properties fall under this category. As the HHS guidance states, tracking technologies on unauthenticated web pages generally do not have access to PHI, so HIPAA regulations generally do not apply to them.

However, HHS does indicate two instances in which IIHI could be exposed or collected:

The login page of a patient portal, linked to from the main unauthenticated website
A search for symptoms or an appointment request form that doesn’t require authentication, but does require a user to enter personal information

How to manage these risks:

Do not allow third parties to collect keystrokes or any information entered on a form, whether via tracking or session recording. (Many organizations use tools like Hotjar, Crazy Egg, or Inspectlet to record sessions and better understand user behavior. Be extremely mindful of using these tools on healthcare websites. Read the fine print and understand what these tools do and do not collect, and what they can and cannot be configured to do.)
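Form input can also leak through the page URL: a search or form that submits with GET puts the entered text into the query string, and the URL is commonly reported to analytics as the page location. A minimal sketch of scrubbing a URL before it is reported, assuming a hypothetical `SAFE_PARAMS` allowlist and `scrub_url` helper (neither comes from the HHS bulletin or any vendor):

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical allowlist of query parameters considered safe to report.
SAFE_PARAMS = {"utm_source", "utm_medium", "utm_campaign"}


def scrub_url(url: str) -> str:
    """Drop every query parameter not on the allowlist, plus the fragment."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name in SAFE_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


# The symptom search term is removed; the campaign parameter survives.
print(scrub_url("https://example.org/search?q=chest+pain&utm_source=newsletter#results"))
# prints "https://example.org/search?utm_source=newsletter"
```

This covers only the URL. Text typed into fields, and any session-recording tool, still need to be handled in that tool's own configuration.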

Do not collect individual IP addresses. GA4 masks IP addresses by default. If you haven’t upgraded to GA4, develop a plan to do so very soon. Also, once more, turn off the advertising ID for Google Analytics.
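For readers unfamiliar with what "masking" an IP address means, the sketch below shows one common truncation scheme: zeroing the last octet of an IPv4 address and the last 80 bits of an IPv6 address. It is an illustration only, not a description of GA4's internals, and the `mask_ip` helper is hypothetical.

```python
import ipaddress


def mask_ip(address: str) -> str:
    """Zero the last octet (IPv4) or the last 80 bits (IPv6) of an address."""
    ip = ipaddress.ip_address(address)
    # Keep the first 24 bits of IPv4 or the first 48 bits of IPv6.
    prefix = 24 if ip.version == 4 else 48
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


print(mask_ip("203.0.113.42"))                  # prints "203.0.113.0"
print(mask_ip("2001:db8:85a3::8a2e:370:7334"))  # prints "2001:db8:85a3::"
```

A truncated address still narrows down a network or region, so masking reduces identifiability rather than eliminating it.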

Managing these areas reduces the risk of collecting IIHI on your digital properties.

Keep your marketing data de-identified.

There is one additional area of considerable risk: your Google Tag Manager account.

GTM accounts often harbor neglected and outdated marketing pixels, custom HTML tags for specific campaigns, and other third-party insertions. Look through your existing GTM account and audit your marketing pixels and their configuration. Remove any that are unnecessary, and make sure you understand what they are collecting.
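A container export can make this audit less manual. The sketch below assumes an exported container JSON shaped like `{"containerVersion": {"tag": [...]}}`, where each tag has a `name` and a `type`, and assumes Custom HTML tags carry the type `"html"`. Verify both assumptions against your own export before relying on it. The `audit_tags` helper and the sample data are hypothetical.

```python
import json
from typing import Any, Dict, List

# Assumption: Custom HTML tags are exported with the type "html".
REVIEW_FIRST_TYPES = {"html"}


def audit_tags(export_json: str) -> List[Dict[str, Any]]:
    """List every tag in a container export, flagging those to review first."""
    container = json.loads(export_json).get("containerVersion", {})
    report = []
    for tag in container.get("tag", []):
        report.append({
            "name": tag.get("name", "(unnamed)"),
            "type": tag.get("type", "(unknown)"),
            "paused": bool(tag.get("paused", False)),
            "review_first": tag.get("type") in REVIEW_FIRST_TYPES,
        })
    return report


# Made-up sample export for illustration only.
sample_export = json.dumps({
    "containerVersion": {
        "tag": [
            {"name": "Page view", "type": "example_template"},
            {"name": "Old campaign pixel", "type": "html", "paused": True},
        ]
    }
})

for row in audit_tags(sample_export):
    print(row)
```

The listing only tells you which tags exist and which deserve a first look. Deciding what each one collects still means reading its configuration.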

Pixels come with their own configuration and can send information to a third-party vendor that the vendor then controls. Once a third party gets data from a pixel, it can freely use and even resell the information to others. Be extremely mindful of what information these marketing pixels are collecting.

With those points in mind, you should be well set up to keep your data de-identified.

In conclusion

Use this guide to ensure your organization complies with the latest HHS and HIPAA guidelines on PHI and IIHI.

If you want the TL;DR version or need to share key takeaways with others in your organization, here you go:

Turn off advertising ID to prevent unique advertising ID sharing.
Upgrade to GA4. (We recommend this for many reasons, not least that older versions of Google Analytics are being sunset in July 2023. Here, the upgrade to GA4 is critical because it ensures automatic IP address masking.)
Be extremely mindful of collecting user-entered text or login credentials. If you are using session-recording software, thoroughly review the documentation and configuration of your software to ensure no text or keystroke inputs are being collected, stored, or shared.
Make sure that third-party tags and pixels in your Google Tag Manager account are in place only to collect the most critical information, and nothing more.