AI & Privacy Protection: Addressing The Need For An AI Confidentiality Privilege

Imagine you’re using ChatGPT to help diagnose a medical issue, giving the model personal details about the ailments you are experiencing. Or perhaps you are using it to work through a traumatic moment in your life, disclosing intimate details of your psyche, past experiences, and feelings to help you cope. Now fast forward: you’re involved in legal trouble, civil or criminal, and those submissions to ChatGPT are deemed relevant enough to be discoverable. That means a lawyer and their team are allowed to sort through the intimate details you professed to ChatGPT and use them as evidence in court to help meet the legal standard for a conviction or money judgment against you.

Sam Altman, the CEO of OpenAI, has acknowledged these privacy shortfalls. On a podcast with Theo Von, he said, “People talk about the most personal shit in their lives to ChatGPT… People use it- young people, especially, use it as a therapist, a life coach; having these relationship problems and asking ‘what should I do?’ And right now, if you talk to a therapist or a lawyer or a doctor about those problems, there’s legal privilege for it. There’s doctor-patient confidentiality, there’s legal confidentiality, whatever. But we have not figured this out for ChatGPT.”

Mr. Altman has a point: there is a significant gap in confidentiality when it comes to AI usage. In most U.S. jurisdictions, communications between a patient and a physician for the purpose of medical diagnosis or treatment are privileged, meaning they cannot be disclosed without the patient’s consent, even where a hearsay exception might otherwise apply. The same goes for communications with a therapist; the reasoning behind that confidentiality is to allow patients to feel free to open up about their feelings in order to further their treatment. A similar privilege exists for the priest-penitent relationship, protecting communications made in pursuit of spiritual guidance. Under the current legal milieu, however, there is no widely recognized carve-out for submissions to an artificial intelligence model, and that omission is a cause for concern for consumers.

Realistically, not all of your conversations with an LLM such as ChatGPT are entitled to confidentiality; after all, you are inputting information into a system that is constantly monitored to improve the model and to flag misuse of the platform. That does not mean, however, that none of your conversations should be protected. It is vital that a legal framework be developed to guide AI developers, lawyers, and AI consumers through this privacy impasse. The legislation would ideally address several issues, such as copyright, intellectual property, consumer protection, and transparency and accountability measures. But arguably the most important issue to address is the use of confidential medical, religious, and psychological communications in courtrooms.

Key Issues the AI Confidentiality Law Should Address

AI models that require inputs from users, such as ChatGPT, Claude, or Grok, should allow users to opt in to create a confidential channel of communication. Laws could be drafted to encourage AI service providers that offer conversational interfaces to provide users with a clearly marked “Confidential Communication Mode” that, when activated, triggers enhanced legal privacy protections equivalent to those afforded to traditional privileged communications. This feature must be easily accessible and explained in plain language to users.

The law should also address the need for affirmative consent to confidential mode activation through a multistep verification process that includes: (a) acknowledgment of the confidential nature of the communication, (b) understanding of the limitations and scope of protection, and (c) explicit consent to the creation of privileged communication records. Additionally, the user would have to agree to use confidential mode in good faith, meaning within the appropriate scope of confidentiality for medical advice, religious confessions, and psychological therapy. Otherwise, bad actors could simply use the feature for illicit purposes that were never meant to be protected. Companies could expand the scope of confidential mode to cover other subjects, but that would not entitle communications that are not medical, religious, or psychological in nature to protection under the proposed law. Even so, users could still find value in confidential mode for those other subjects, because the law would also govern how the company handles the data.
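To make the activation requirement concrete, here is a minimal sketch of how a provider might model the three-step consent check described above. The class names, fields, and categories are hypothetical illustrations of the proposal, not any existing product’s API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class PrivilegedCategory(Enum):
    # The three subject areas the proposal would actually privilege.
    MEDICAL = "medical"
    RELIGIOUS = "religious"
    PSYCHOLOGICAL = "psychological"


@dataclass
class ConfidentialModeConsent:
    acknowledged_confidentiality: bool                # step (a)
    understood_scope_and_limits: bool                 # step (b)
    explicit_consent_given: bool                      # step (c)
    declared_category: Optional[PrivilegedCategory]   # good-faith purpose


def activate_confidential_mode(consent: ConfidentialModeConsent) -> bool:
    """Activate only when every consent step is complete and a privileged
    purpose has been declared; otherwise the session stays in ordinary,
    non-privileged mode."""
    return (
        consent.acknowledged_confidentiality
        and consent.understood_scope_and_limits
        and consent.explicit_consent_given
        and consent.declared_category is not None
    )
```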

For example, the law should also stipulate that communications designated as confidential must be stored in segregated, encrypted databases separate from general training data. This is similar to the Illinois Biometric Information Privacy Act (BIPA), which regulates the collection and storage of biometric data obtained from consumers. Confidential communications could still be used for model training in a more limited capacity, with the added protection that all personally identifiable information would be stripped away, allowing the models to improve while maintaining user privacy. Further, as under BIPA, communications made in the proposed confidential mode must be subject to automatic deletion after a period not exceeding one year, unless the user explicitly consents to a longer retention period. Consumers must also retain the unilateral right to delete confidential communications at any time, without explanation or justification.
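A rough sketch of those retention rules might look like the following. The one-year window mirrors the proposal, while the `strip_pii` hook is a placeholder; the actual de-identification and deletion mechanics would be up to each provider.

```python
from datetime import datetime, timedelta, timezone

# Assumed ceiling from the proposal: retention of confidential records
# for no more than one year unless the user explicitly opts to extend.
DEFAULT_RETENTION = timedelta(days=365)


def is_due_for_deletion(created_at: datetime, user_extended: bool) -> bool:
    """A confidential record should be automatically deleted once the
    retention window lapses, unless the user chose to extend it."""
    if user_extended:
        return False
    return datetime.now(timezone.utc) - created_at > DEFAULT_RETENTION


def prepare_for_limited_training(text: str, strip_pii) -> str:
    """Before any limited training use, personally identifiable information
    is removed; `strip_pii` stands in for whatever de-identification
    routine the provider implements."""
    return strip_pii(text)
```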

The AI confidentiality privilege would not be absolute. Rather, it would be limited like any other confidentiality privilege, with exceptions only for matters such as: (a) an imminent threat of harm to self or others, (b) child abuse reporting requirements, and (c) court-ordered disclosure following in-camera judicial review demonstrating compelling need and a lack of alternative sources. This requires AI companies to rigorously flag any communication that falls within those exceptions.
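One way a provider might record those exceptions internally is sketched below. How items (a) and (b) are actually detected is left to the provider’s own safety tooling, and the names here are illustrative only.

```python
from enum import Enum, auto


class PrivilegeException(Enum):
    IMMINENT_HARM = auto()       # (a) imminent threat of harm to self or others
    CHILD_ABUSE_REPORT = auto()  # (b) mandatory child abuse reporting
    COURT_ORDERED = auto()       # (c) disclosure ordered after in-camera review


def privilege_applies(flagged_exceptions: set) -> bool:
    """The privilege holds only when no statutory exception has been
    flagged for the communication in question."""
    return len(flagged_exceptions) == 0
```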

Further, any attempt to obtain confidential AI communications through legal discovery must meet a “clear and convincing evidence” standard demonstrating that: (a) the information is essential to the legal proceeding, (b) no alternative sources exist, (c) the probative value substantially outweighs privacy concerns, and (d) less invasive means of obtaining the information have been exhausted.

This legislative framework should also take a phased implementation approach, giving major AI service providers roughly three years from enactment to develop and deploy confidential communication capabilities. The legislation should include provisions for regular review and updates to address evolving technology and emerging privacy concerns, which is key in an ever-evolving field.

This proposed AI law would balance the legitimate needs of AI development against users’ fundamental privacy rights, creating a framework that recognizes the unique role AI is beginning to play in personal healthcare and psychological support while establishing meaningful legal protections comparable to those of traditional professional relationships.

Sources:

-Daniel J. Solove, A Brief History of Information Privacy Law

-The Rules of Federal Procedure

-This Past Weekend with Theo Von, Episode 59

Artificial Intelligence & American Copyright Law: Analyzing the Copyright Office’s AI Report

Copyright Office’s AI Report: The Good, The Bad, and The Controversial

The Copyright Office just dropped Part 3 of its AI report, which aims to address certain copyright questions regarding artificial intelligence. The thing that’s got everyone talking is that the report was supposed to tackle infringement issues head on, but instead teased us by saying those answers will come in “Part 4,” expected at a later date. Let’s dive into what was actually discussed.

Legal Theory: A Case by Case Basis

The report’s central thesis is a pretty straightforward legal theory. Basically, it recommends that there be no blanket rule on whether training AI on copyrighted content constitutes infringement or fair use. Everything gets the case-by-case treatment, which is both realistic and frustrating depending on where you sit. Most lawyers like clear bright-line rules backed by years of precedent, but when building legal frameworks for emerging technologies, the bright-line approach is easier said than done.

The report acknowledges that scraping content for training data is different from generating outputs, and those are different from outputs that get used commercially. Each stage implicates different exclusive rights, and each deserves separate analysis. So in essence, what’s actually useful here is the recognition that AI development involves multiple stages, each with its own copyright implications.

This multi-stage approach makes sense, but it also means more complexity for everyone involved. Tech companies can’t just assume that fair use covers everything they’re doing, and content creators can’t assume it covers nothing. The devil is in the details.

Transformative Use Gets Complicated

The report reaffirms that various uses of copyrighted works in AI training are “likely to be transformative,” but then immediately complicates things by noting that transformative doesn’t automatically mean fair. The fairness analysis depends on what works were used, where they came from, what purpose they served, and what controls exist on outputs.

This nuanced approach is probably correct legally, but it’s also a nightmare for anyone trying to build AI systems at scale. You can’t just slap a “transformative use” label on everything and call it a day. The source of the material matters, and whether the content was pirated or legally obtained can factor into the analysis. Purpose matters too, since commercial use and research use will likely yield different results in the copyright realm. And control and mitigation matter, because developing the necessary guardrails is paramount to preventing direct copying or market substitution.

Nothing too revolutionary here, but the emphasis on these factors signals that the Copyright Office is taking a more sophisticated approach than some of the simplistic takes we’ve seen elsewhere. That should be reassuring, since a one-size-fits-all approach at such an early stage of AI development could stifle innovation. On the other hand, if things are left too uncontrolled, copyrighted works may be left vulnerable to infringement.

The Fourth Factor Controversy

Here’s where things get interesting and controversial. The report takes an expansive view of the fourth fair use factor: the effect on the potential market for the copyrighted work. The concern is that a flood of AI-generated content brings fears of market dilution, lost licensing opportunities, and broader economic harm.

The Office’s position is that the statute covers any “effect” on the potential market, which is a broad interpretation. But that breadth has a reason: the Office is worried about the “speed and scale” at which AI systems can generate content, creating what it sees as a “serious risk of diluting markets” for similar works. Imagine an artist creates a new masterpiece only to have it copied by an AI model that makes the piece easily recreatable by anyone, diluting the value of the original. These kinds of things are happening in the market today.

This gets particularly thorny when it comes to style. The report acknowledges that copyright doesn’t protect style per se, but then argues that AI models generating “material stylistically similar to works in their training data” could still cause market harm. That’s a fascinating tension: you can’t copyright a style, but you might be able to claim market harm from AI systems that replicate it too effectively. It will be interesting to see how courts apply these rules going forward.

This interpretation could be a game-changer, and not necessarily in a good way for AI developers. If every stylistic similarity becomes a potential market harm argument, the fair use analysis becomes much more restrictive than many in the tech industry have been assuming.

The Guardrails

One of the more practical takeaways from the report is its emphasis on “guardrails” as a way to reduce infringement risk. The message is clear: if you’re building AI systems, you better have robust controls in place to prevent direct copying, attribution failures, and market substitution.

This is where the rubber meets the road for AI companies. Technical safeguards, content filtering, attribution systems, and output controls aren’t just up to the discretion of the engineers anymore; they’re becoming essential elements of any defensible fair use argument.

The report doesn’t specify exactly what guardrails are sufficient, which leaves everyone guessing. But the implication is clear: the more you can show you’re taking steps to prevent harmful outputs, the stronger your fair use position becomes. So, theoretically, a model with enough guardrails may help its developer mitigate damages if the model accidentally outputs copyrighted works.
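As one illustration of what such a guardrail might look like in code, here is a minimal sketch of an output filter that checks generated text for heavy n-gram overlap with known protected passages before returning it. The function names, the overlap threshold, and the idea of keeping a `protected_texts` reference list are assumptions for the sake of example; the report itself doesn’t prescribe any particular technique.

```python
def ngram_set(text: str, n: int = 8):
    """Build the set of word n-grams in a text, for rough overlap checks."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def looks_like_verbatim_copy(output: str, protected_texts, threshold: float = 0.3) -> bool:
    """Flag an output whose n-gram overlap with any protected reference text
    exceeds the threshold; a flagged output would be blocked or regenerated
    instead of being returned to the user."""
    out_grams = ngram_set(output)
    if not out_grams:
        return False
    for reference in protected_texts:
        overlap = len(out_grams & ngram_set(reference)) / len(out_grams)
        if overlap >= threshold:
            return True
    return False
```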

RAG Gets Attention

The report also dives into Retrieval Augmented Generation (RAG), which is significant because RAG systems work differently from traditional training approaches. Instead of baking copyrighted content into model weights, RAG systems retrieve and reference content dynamically.

This creates different copyright implications: potentially more like traditional quotation and citation than wholesale copying. But it also creates new challenges around attribution, licensing, and fair use analysis. The report doesn’t resolve these issues, but it signals that the Copyright Office is paying attention to the technical details that matter.
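To illustrate the distinction, here is a minimal sketch of a RAG-style flow under stated assumptions: documents are retrieved and quoted with source identifiers at answer time rather than being baked into model weights. The naive word-overlap retrieval and the function names are stand-ins; a production system would use embeddings and, ideally, a licensed corpus.

```python
def retrieve(query: str, corpus: dict, k: int = 2):
    """Rank source documents by naive word overlap with the query. A real
    system would use embeddings, but the copyright point is the same:
    content is fetched at answer time, not memorized in model weights."""
    query_words = set(query.lower().split())
    ranked = sorted(
        corpus.items(),
        key=lambda item: len(query_words & set(item[1].lower().split())),
        reverse=True,
    )
    return ranked[:k]


def build_cited_prompt(query: str, corpus: dict) -> str:
    """Quote the retrieved passages with their source identifiers so that
    attribution can be carried through to the model's answer."""
    hits = retrieve(query, corpus)
    context = "\n".join(f"[{doc_id}] {text}" for doc_id, text in hits)
    return f"Answer using only the cited sources below.\n{context}\n\nQ: {query}"
```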

Licensing

The report endorses voluntary licensing and extended collective licensing as potential solutions, while rejecting compulsory licensing schemes or new legislation “for now.” This is probably the most politically palatable position, but it doesn’t solve the practical problems.

Voluntary licensing sounds great in theory, but the transaction costs are enormous when you’re dealing with millions of works from thousands of rights holders. Extended collective licensing might work for some use cases, but it requires coordination that doesn’t currently exist in most creative industries.

The “for now” qualifier is doing a lot of work here. It suggests that if voluntary solutions don’t emerge, more aggressive interventions might be on the table later.

The Real Stakes

What makes this report particularly significant isn’t just what it says, but what it signals about the broader policy direction. The Copyright Office is clearly trying to thread the needle between protecting creators and enabling innovation, but the emphasis on expansive market harm analysis tilts toward the protection side.

For AI companies, this report is a warning shot. The days of assuming that everything falls under fair use are over. The need for licensing, guardrails, and careful legal analysis is becoming unavoidable.

For content creators, it’s a mixed bag. The report takes their concerns seriously and provides some theoretical protection, but it doesn’t offer the clear-cut prohibitions that some have been seeking.

The real test will come in the courts, where these theoretical frameworks meet practical disputes. But this report will likely influence how those cases get decided, making it required reading for anyone in the AI space.

As we can see, AI and copyright law are only becoming more complex. The simple answers that everyone wants don’t exist, and this report makes that abundantly clear. The question now is whether the industry can adapt to this new reality or whether we’re heading for a collision that nobody really wants.