ChatGPT and other large language models (LLMs) have taken the world by storm, and compliance and information security teams are scrambling to keep up. The use of ChatGPT has implications for privacy regulations such as the GDPR (General Data Protection Regulation) and for compliance and information security in general. Recent misuse of these tools has already caused data leaks at major corporations: Samsung [1], for example, suffered an incident in which crucial trade secrets were pasted into OpenAI’s ChatGPT.

ChatGPT’s default settings expose you to the most risk, for two key reasons. First, OpenAI stores the content (prompts and responses) it receives and may use it to improve its models, which means data fed into ChatGPT can remain on OpenAI’s servers indefinitely. This should be avoided when working with sensitive data such as company secrets or personal information. OpenAI’s policy from March 2023 indicated that API users must opt in to have their data used to train or improve its models, while users of non-API services such as ChatGPT must opt out to avoid having their data used [2].

The second risk applies specifically to non-US-based companies, since all content is processed and stored in the US. This is especially problematic when processing personal data: the EU’s Schrems II ruling found that US surveillance practices provide insufficient protection for the data of EU residents, making the transfer of personal data to the US unlawful without additional safeguards.

Some other potential compliance risks of using ChatGPT include:

Risk 1 – Inaccurate answers

The most common issue with ChatGPT and other LLM tools is their tendency to provide incorrect or inaccurate information. OpenAI itself warns: “ChatGPT will occasionally make up facts or ‘hallucinate’ outputs. If you find an answer is unrelated, please provide that feedback by using the ‘Thumbs Down’ button.” [1] This feedback feeds into “Reinforcement Learning from Human Feedback” (RLHF for short), and it is equally important to give positive feedback when ChatGPT’s output is correct. Note that the ChatGPT interface differs from the API when it comes to RLHF in practice: the thumbs up/down feedback is collected only through the ChatGPT UI.

Legal and compliance leaders should issue guidance requiring employees to review any output generated by ChatGPT for accuracy, appropriateness, and actual usefulness before accepting it.

Risk 2 – Copyright and intellectual property risks

ChatGPT is trained on a large amount of internet data that may include copyrighted material, so its outputs have the potential to violate copyright or IP protections. As OpenAI’s privacy policy stated as of 27 April 2023, they may collect personal information (which might also include your intellectual property), use it to improve their services, and disclose it to affiliates and vendors without notifying you [2]. Employees and organisations should not rely solely on the intellectual property and copyright rules of OpenAI or similar providers, but follow the corporate IP policies that supersede them. For example, you should not share any confidential or sensitive information with ChatGPT or other LLMs.

Legal and compliance leaders should keep a keen eye on any changes to copyright law that apply to ChatGPT output, and require users to scrutinize any generated output to ensure it does not infringe copyright or intellectual property rights.

Risk 3 – Cyber attacks

Bad actors may misuse ChatGPT to generate false information or trick it into writing malicious code. AI (Artificial Intelligence) can also be used to craft convincing phishing lures; ChatGPT itself can be attacked and its behaviour altered, to say nothing of zero-day exploits and data leaks. In March 2023, OpenAI took ChatGPT offline temporarily after receiving reports of a bug that allowed some users to see the titles of other users’ chat histories. Incidents like this will likely happen again given how much ChatGPT has grown in popularity since it was launched.

Leaders should equip their IT teams with tools that can distinguish ChatGPT-generated text from text written by humans, geared specifically toward screening incoming “cold” emails.

Risk 4 – Consumer protection risks

Businesses that fail to disclose ChatGPT usage to consumers (e.g., in the form of a customer support chatbot) run the risk of losing their customers’ trust and being charged with unfair practices under various laws, such as CCPA and GDPR. For instance, the California chatbot law mandates that in certain consumer interactions, organizations must disclose clearly and conspicuously that a consumer is communicating with a bot [6]. 

Legal and compliance leaders need to ensure their organization’s ChatGPT use complies with all relevant regulations and laws, and appropriate disclosures have been made to customers. 

How do you address some of these risks?

  1. Use monitoring tools to track and audit ChatGPT usage across the organization (a minimal log-scan sketch follows this list).
  1. A more compliance- and privacy-friendly way of using GPT models is to use OpenAI’s APIs (application programming interfaces) directly, though in a corporate setting this still needs careful monitoring and remains subject to OpenAI’s terms of use [5]. A minimal API example also follows this list.
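For a flavour of what usage auditing can look like, here is a minimal sketch in Python that flags ChatGPT traffic in an outbound web-proxy log. The log format (timestamp, user, URL per line) is hypothetical; a real deployment would adapt the parsing to its proxy’s actual export format or use a commercial monitoring tool.

```python
# Minimal sketch: count requests to ChatGPT/OpenAI endpoints per user from a
# space-separated proxy log. Log format is hypothetical: "timestamp user url".
from collections import Counter

CHATGPT_HOSTS = ("chat.openai.com", "chatgpt.com", "api.openai.com")

def audit_chatgpt_usage(log_path: str) -> Counter:
    """Return a per-user count of requests to ChatGPT/OpenAI endpoints."""
    hits = Counter()
    with open(log_path) as log:
        for line in log:
            try:
                _timestamp, user, url = line.split(maxsplit=2)
            except ValueError:
                continue  # skip malformed lines
            if any(host in url for host in CHATGPT_HOSTS):
                hits[user] += 1
    return hits

if __name__ == "__main__":
    for user, count in audit_chatgpt_usage("proxy.log").most_common():
        print(f"{user}: {count} requests to ChatGPT/OpenAI endpoints")
```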
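And for the API route, the same kind of prompt can be sent with a few lines of code instead of the ChatGPT UI. The sketch below uses Python’s requests library against OpenAI’s public chat completions endpoint [3]; the model name and prompt are illustrative, and the API key is read from an environment variable rather than hard-coded.

```python
# Minimal sketch: calling the OpenAI Chat Completions API directly.
import os
import requests

response = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json={
        "model": "gpt-3.5-turbo",  # illustrative model name
        "messages": [
            {"role": "user", "content": "Summarize the GDPR in one sentence."}
        ],
    },
    timeout=30,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```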

Using ChatGPT through the API is conceptually the same as entering a prompt in the ChatGPT UI, except that with the API the user needs an HTTP client such as cURL, Postman, or SoapUI, or a few lines of code as sketched above. Some other ways to address these risks include:

  1. If a user has accidentally entered sensitive or proprietary data into ChatGPT, there are ways to limit further damage, such as deleting the particular chat conversation, opting out of having content used for AI training purposes, or, in the worst case, deleting the account completely along with all the data entered. Organizations that utilize generative AI need to be aware of the complexities of erasing data when requested to do so, as it requires a thorough understanding of how AI systems interpret and generate responses [8].
  1. If you are a company that processes personal data as defined in the GDPR or CCPA, execute the data processing addendum offered by OpenAI [9].
  1. Your employees are still your first line of defense. Consistent and continuous security awareness programs are imperative to organizational data privacy best practices. Although there is cause for concern about external agents exploiting ChatGPT, it is equally important to train internal users on how to use AI responsibly and respond to threats accordingly. 

A few areas that should also be addressed in security awareness training include:

  1. Types of sensitive client/personal data – this covers all sensitive and pertinent information an organization holds, including data about its clients and operations. It often includes, but is not limited to, intellectual property, strategic plans, financial information, client-specific data, and any other information that could harm the organization or its clients if misused or exposed.
  1. How not to use ChatGPT, with examples of how sensitive data can be inadvertently shared through it
  1. How to anonymize sensitive data  
  1. What the legal implications are of using sensitive client data in ChatGPT 

Additional ChatGPT usage guidelines for employees should include: 

  1. Data minimization: limit the data processed by LLMs to only what is necessary for a particular task. Anonymize, pseudonymize, or remove sensitive information before feeding it into the model (see the redaction sketch after this list).
  1. Content filtering: this can apply both to the input data sent to an LLM service and to the output sent back. Output should be regularly or automatically checked for format, correctness (where possible), and potentially damaging or offensive content (when generating free text); input can be filtered with techniques such as the redaction step sketched after this list.
  1. Local deployment: some LLMs can be deployed locally or in a private cloud, which reduces the risk of data leaks and gives greater control over data storage and access (a minimal sketch of this follows as well).
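To make the first two guidelines concrete, here is a minimal sketch of data minimization and content filtering: obvious PII patterns are redacted before a prompt leaves the organization, and the model’s response is screened against a blocklist. The regexes and the blocked term are illustrative only; a production system would use a dedicated PII-detection library.

```python
# Minimal sketch: redact PII before sending a prompt; screen the response.
import re

PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "PHONE": re.compile(r"\b(?:\+?\d[\d\s().-]{7,}\d)\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}
BLOCKED_OUTPUT_TERMS = ["Project Falcon"]  # hypothetical internal code name

def minimize(prompt: str) -> str:
    """Replace detected PII with typed placeholders before sending the prompt."""
    for label, pattern in PII_PATTERNS.items():
        prompt = pattern.sub(f"[{label}]", prompt)
    return prompt

def filter_output(response: str) -> str:
    """Reject responses that leak blocked internal terms."""
    for term in BLOCKED_OUTPUT_TERMS:
        if term.lower() in response.lower():
            raise ValueError(f"Response blocked: contains restricted term {term!r}")
    return response

print(minimize("Contact Jane at jane.doe@example.com or 555-123-4567."))
# -> Contact Jane at [EMAIL] or [PHONE].
```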
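And as a minimal illustration of local deployment, the sketch below runs a small open-source model on-premises with the Hugging Face transformers library, so prompts never leave the organization’s infrastructure. GPT-2 is chosen purely because it is tiny; a real deployment would use a stronger open model.

```python
# Minimal sketch: run a small open-source model locally so prompts stay
# inside the organization's infrastructure.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # tiny model, for illustration
result = generator(
    "Data protection policies should",
    max_new_tokens=40,
    num_return_sequences=1,
)
print(result[0]["generated_text"])
```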

By implementing these measures, organizations can mitigate the compliance risks associated with LLMs and generative AI, and promote responsible and secure usage of the technology.

AI is your friend: embrace this new and exciting technology  

While the use of AI models such as ChatGPT may cause some concern, the technology is as exciting as it is unsettling. As ChatGPT itself says – “As an AI language model, ChatGPT is a tool that can be used for both positive and negative purposes. It is important to recognize that while it has the potential to revolutionize the way we interact with technology and each other, it also has limitations and ethical considerations. Whether we fear or embrace ChatGPT depends on how it is developed, deployed, and used”.

On the other hand, LLMs open up a plethora of new opportunities and are powerful at evaluation tasks due to the nature of how they are trained. Some of these opportunities include:

  • Low-hanging fruit such as writing blog posts and other written content, accelerating code development, summarizing text, enhancing research, and analyzing text for themes.
  • ChatGPT plug-ins – these could be developed to extend its core capabilities and be customized for a given industry or segment of customers.
  • Existing apps could be expanded by integrating with ChatGPT, for example, with the Expedia plugin ChatGPT can access travel data, allowing it to answer user queries about flight availability, hotel bookings, and vacation packages. This helps travel agencies streamline their customer service and offer personalized recommendations.  

  • Azure OpenAI Service is another example: it offers the same capabilities as OpenAI’s ChatGPT and is not limited to just GPT-3.5.

  • Organizations can build their own proprietary, locally hosted LLMs on their proprietary data. This approach has high potential for providing a strategic advantage, but it is currently technically difficult and requires highly specialized skills and knowledge. Building an LLM from scratch isn’t easy; fine-tuning an open-source model on the proprietary data an organization already holds is considerably easier (a minimal sketch follows this list).
  • The creation of a new job market and roles such as prompt engineers, machine learning engineers, and ethics and privacy specialists who work with large language models. Many more job opportunities will be created around ChatGPT-like applications.
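Picking up the fine-tuning point above: here is a minimal sketch of fine-tuning a small open-source causal language model on an in-house text file, using the Hugging Face transformers and datasets libraries. The model choice, file name, and training settings are all illustrative; real fine-tuning needs far more data, compute, and evaluation.

```python
# Minimal sketch: fine-tune a small open-source causal LM on in-house text.
# "company_docs.txt" (one document per line) is a hypothetical file name.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "gpt2"  # small open model; swap for a stronger one in practice
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# Load and tokenize the proprietary corpus.
dataset = load_dataset("text", data_files={"train": "company_docs.txt"})
def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)
tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned-model", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=tokenized,
    # mlm=False makes the collator produce causal-LM labels from the inputs.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
trainer.save_model("finetuned-model")
```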

Final thoughts 

Understanding and navigating the compliance risks of ChatGPT is crucial for organizations and individuals alike. Compliance is an ongoing process, and as this powerful language model continues to shape various aspects of our lives, it’s important to recognize the potential risks and take proactive measures to mitigate them.

Sources

[1] https://help.openai.com/en/articles/6783457-what-is-chatgpt

[2] https://openai.com/policies/privacy-policy

[3] https://platform.openai.com/docs/api-reference/introduction

[4] https://help.openai.com/en/articles/6378407-how-can-i-delete-my-account

[5] https://openai.com/policies/terms-of-use

[6] https://www.zscaler.com/blogs/product-insights/make-generative-ai-tools-chatgpt-safe-and-secure-zscaler

[7] https://www.gartner.com/en/newsroom/press-releases/2023-05-18-gartner-identifies-six-chatgpt-risks-legal-and-compliance-must-evaluate

[8] https://www.ml6.eu/blogpost/the-compliance-friendly-guide-to-using-chatgpt-and-other-gpt-models

[9] https://securityintelligence.com/posts/using-chatgpt-as-an-enabler-for-risk-and-compliance/

[10] https://www.themaryword.com/post/should-we-fear-or-embrace-chatgpt

[11] https://www.forbes.com/sites/forbestechcouncil/2023/05/15/the-strategic-opportunities-of-advanced-ai-a-focus-on-chatgpt/?sh=2bba8c893f46

[12] https://www.theverge.com/2018/6/27/17510908/apple-samsung-settle-patent-battle-over-copying-iphone

[13] https://medium.com/@jakairos/the-tipping-point-chatgpt-plugins-create-new-opportunities-and-crush-dreams-1027bc1016f3
