
Improving Participant Anonymization

How can we best protect participant identities in research? Here are our top tips and methods
Tom
Oct 17, 2022
(3 min)

On a need-to-know basis

In the data-heavy world we live in, our personal information is everywhere: our names, our addresses, and even our family histories. While so much of our identity is in the public domain, and there is not much we can do about it, the management of personal data has become a staple of all ethical data collection processes. Things are improving. General rules of thumb for collecting data are (i) to keep it as impersonal as possible, (ii) to collect as little as possible and (iii) to collect only what you need.

When we assign participants to groups in a clinical trial or research study, knowing their name is possibly the most irrelevant part of the trial. Names offer enormous clues to personal identity, but are almost always unrelated to the actual research question. How do we recruit and enrol participants while keeping identifiers as cryptic and secure as possible? Trialflare has a number of methods in place which we use to protect participant identity. Here are a few of them, and how they can be useful for you in your research.

[Method 1] Sequential Numbering [1, 2...]

Possibly the most basic numbering system used in research: IDs begin at 1 and increase by one each time (1, 2, 3 and so on). We can also use this method to add leading zeros or a specific prefix or suffix to the sequence. For example [001, 002, 003...], [P-001-M, P-002-F, P-003-NB...] and so on.

Pros of this method
  • Very easy to set up.
  • Easy for people of any skill level to work with (participant or any administrator).
  • Very manageable with most statistical software packages.
  • A very "human" way of working with numbers and IDs.

Cons of this method
  • The risk of human error is higher because IDs are so similar.
  • Without an additional prefix, suffix or leading zeros, IDs are not very distinctive.
  • Other participant IDs are easy to guess.
  • In digital systems which are not password protected, there is a chance that the wrong participant can log in and submit data on behalf of someone else.
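
As a rough illustration of Method 1, the sketch below builds zero-padded sequential IDs with an optional prefix and suffix. The function name `sequential_ids` and its parameters are hypothetical, chosen for this example only; this is not a description of how Trialflare generates IDs.

```python
def sequential_ids(count, start=1, width=3, prefix="", suffix=""):
    """Return `count` sequential IDs, zero-padded to `width` digits.

    All names and defaults here are illustrative assumptions.
    """
    return [
        f"{prefix}{number:0{width}d}{suffix}"
        for number in range(start, start + count)
    ]


print(sequential_ids(3))               # ['001', '002', '003']
print(sequential_ids(3, prefix="P-"))  # ['P-001', 'P-002', 'P-003']
```

The output also shows the main weakness listed above: once you have seen one ID, the neighbouring IDs are trivial to guess.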

[Method 2] Random Word Chaining [singing-green-monkey, dancing-blue-giraffe...]

Random word chaining is commonly used in more casual (often non-medical) studies, and is gaining traction over time. Using comprehensive word libraries, words can be hyphenated together to generate distinctive participant IDs.

Pros of this method
  • Words can be very memorable.
  • Combinations are far more distinctive than sequential numbers.
  • Combinations are much less likely to be guessed by another person.
  • Huge variety in usable words.

Cons of this method
  • Word libraries are language restricted.
  • Illiteracy or dyslexia can make remembering or repeating these sequences troublesome for some participants and even research team members.
  • Risk of word association bias or confusion with other known phrases.
  • Generated word combinations might carry a meaning for someone that could be construed as offensive or insensitive.
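
A minimal sketch of Method 2 might look like the following. The three word lists are tiny placeholders invented for this example (a real study would need much larger lists, reviewed for offensive or confusing combinations), and the function names are hypothetical. Random selection alone does not guarantee that two participants never receive the same ID, so the sketch collects IDs in a set and keeps drawing until it has enough distinct ones.

```python
import secrets

# Placeholder word lists (assumption): real lists would be far larger and vetted.
VERBS = ["singing", "dancing", "jumping", "reading"]
COLOURS = ["green", "blue", "amber", "violet"]
ANIMALS = ["monkey", "giraffe", "otter", "heron"]


def max_combinations():
    """Number of distinct verb-colour-animal IDs the lists can produce."""
    return len(VERBS) * len(COLOURS) * len(ANIMALS)


def word_chain_id():
    """Pick one word from each list and join them with hyphens."""
    return "-".join(secrets.choice(words) for words in (VERBS, COLOURS, ANIMALS))


def unique_word_chain_ids(count):
    """Return `count` distinct word-chain IDs, redrawing on any collision."""
    if count > max_combinations():
        raise ValueError("word lists are too small for that many unique IDs")
    ids = set()
    while len(ids) < count:
        ids.add(word_chain_id())
    return sorted(ids)


print(unique_word_chain_ids(3))
```

With four words per list there are only 4 × 4 × 4 = 64 possible IDs, which is why the size of the word library matters so much for this method.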

[Method 3] Random Letter-Number Hybridisation [8xpea, wu3cc...]

Concatenating random letters and numbers produces a very large pool of possible participant IDs, which makes each ID highly distinctive.

Pros of this method
  • Much lower word or number association bias.
  • Combinations are far more distinctive than sequential numbers.
  • Combinations are much less likely to be guessed by another person.
  • Each additional letter or number greatly enlarges the pool of potentially usable IDs.

Cons of this method
  • Some language restrictions remain, for example where the written language uses a fundamentally different script.
  • Illiteracy or dyslexia can also make remembering or repeating these sequences troublesome for some participants and even research team members.
  • Some upper-case or lower-case letters can look similar to numbers (for example O and 0, or l and 1).
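
One way Method 3 could be sketched is shown below, assuming five-character IDs made of lower-case letters and digits, as in the examples in the heading. To soften the look-alike problem noted above, the sketch drops a small set of easily confused characters; which characters to exclude is an assumption made for this example, as are all the names used.

```python
import secrets
import string

# Characters that are easy to misread (assumption: adjust for your own study).
AMBIGUOUS = set("0o1li")
ALPHABET = "".join(
    ch for ch in string.ascii_lowercase + string.digits if ch not in AMBIGUOUS
)


def random_id(length=5):
    """Build one ID by drawing `length` characters at random from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def pool_size(length=5):
    """Number of distinct IDs of this length: alphabet size to the power of length."""
    return len(ALPHABET) ** length


def unique_random_ids(count, length=5):
    """Return `count` distinct IDs, redrawing on the rare collision."""
    if count > pool_size(length):
        raise ValueError("ID length is too short for that many unique IDs")
    ids = set()
    while len(ids) < count:
        ids.add(random_id(length))
    return sorted(ids)


print(len(ALPHABET))         # 31 characters remain after exclusions
print(pool_size(5))          # 28629151 possible five-character IDs
print(unique_random_ids(3))
```

Because the pool grows as the alphabet size raised to the power of the ID length, adding a single extra character multiplies the number of possible IDs by 31 in this sketch.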

Recommendations?

Each study and trial will be different, and deciding on the most effective strategy to implement in your own work is a decision you and your team must make. If your study requires ethical approval, these options might be discussed at panel. Whilst we believe that Method 3 offers the highest level of security and non-redundancy, every method has its own pros and cons which must be considered by your research team.

If you are collecting data through surveys, questionnaires, or as part of clinical or nutritional trials or public health research, get in touch to learn more.

Use the contact form here or email us at hello@trialflare.com