Protecting data when using AI – top tips

The use of artificial intelligence (AI) among businesses is growing exponentially – every minute US$293,000 is spent globally on AI, according to cloud software company Domo.  

It has been estimated that around 40% of businesses globally currently use AI. Use in Australia is even higher. The CSIRO reported that, in 2023, 68% of Australian businesses had already implemented AI technologies and a further 23% planned to do so within the following 12 months.

The rush to embrace AI is understandable, given that a survey by digital and cloud solutions provider Avanade found 96% of businesses believe they must transition to an AI-first operating model within a year to remain competitive and meet customer expectations.

Adopting AI technology can deliver numerous benefits including: 

  • Increased efficiency and productivity through automation of repetitive tasks. 
  • Enhanced decision-making capabilities through analysis of vast amounts of data. 
  • Reduced costs through automation and data-driven insights (e.g. anticipating maintenance needs). 
  • Improved customer experience and engagement through provision of personalised and efficient interactions. 
  • Encouragement of innovation through exploration of new ideas and development of cutting-edge solutions. 
  • Data analysis and insights through rapid processing of large volumes of current and historical data, drawing conclusions, and forecasting future trends or behaviours. 

However, there are some risks (see our article). One risk of particular concern is data protection. A report from Lenovo found IT decision-makers were focussed on the security and quality of AI and its data. The survey identified data security as the top priority for IT professionals in Australia, with data security and control being a primary factor limiting organisations' uptake of generative AI. A key concern was that generative AI can jeopardise the business' control of its data and intellectual property assets.

AI tools gather, store and process significant amounts of data. The way data is captured and collected, the extraction of the data's features (its fields and characteristics), and how data is engineered to train the model can all pose risks. Consequently, it is imperative that businesses understand how data is collected, stored and used when it comes to AI. Without that understanding, businesses can be exposed to numerous risks – reputational, legal, liability, data and security.

As the use of AI continues to grow, the need for businesses to be proactive when it comes to data protection also rises.  

To help protect data, businesses may consider: 

  • Implementing robust data security measures and governance frameworks. 

Protecting private data/sensitive information should be prioritised, especially in light of new privacy laws. Ensure that the board, management and employees are aware of the business’ privacy obligations and understand relevant data protection legislation and requirements. Inputting personally identifiable information (PII) to train an AI model may result in models that inadvertently reveal sensitive information about individuals or groups. 

Businesses should be mindful of the information they input into AI software, as it may be vulnerable to data breaches. Also, particularly where businesses are using free versions of public models, the data inputted may be used to train the next version – meaning private or confidential information could resurface in the outputs of a later iteration of the model.
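
To make this concrete, below is a minimal sketch in Python of stripping obvious identifiers from text before it is submitted to an external model. The patterns and the example prompt are invented for illustration; real-world redaction is considerably harder and usually warrants dedicated PII-detection tooling.

```python
import re

# Illustrative patterns only – simple regexes like these miss many
# real-world cases and are no substitute for dedicated PII tooling.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "phone": re.compile(r"(?:\+61|0)[2-478](?:[ -]?\d){8}"),  # AU-style numbers
}

def redact_pii(text: str) -> str:
    """Replace anything matching a known PII pattern with a placeholder."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[REDACTED {label.upper()}]", text)
    return text

prompt = "Summarise the complaint from jane@example.com (ph. 0412 345 678)."
print(redact_pii(prompt))
# Summarise the complaint from [REDACTED EMAIL] (ph. [REDACTED PHONE]).
```

Only the redacted text would then be passed to the external AI service.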

A strong AI use policy should be developed and implemented. Businesses should also implement a strict data-handling and data-storage policy, and use a secure authentication system to protect data from malicious actors. 

With the Government looking to regulate AI and introduce mandatory guardrails, businesses should ensure they stay up to date and amend their practices to reflect statutory obligations. Keeping abreast of the latest security trends and best practices can help to ensure the business remains protected.

Businesses may also consider investing in paid AI tools that can offer better security and reliability than free ones. 

Continuously evaluating and updating the business’ risk profile to reflect new vulnerabilities introduced by AI is also recommended. The business’ protection measures should be robust and adaptive, with the security solutions and tools updated to defend against AI-specific threats.  

It is also important for businesses to develop and maintain robust data back-up and recovery strategies. 

  • Establishing comprehensive data quality and accuracy protocols to support AI initiatives. 

Any output from an AI tool is only as good as the data inputted into it. All information a business inputs into AI software should be vetted for accuracy.
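
As a simple illustration of that vetting step, the sketch below (Python, with field names and validation rules invented for the example) rejects malformed records before they reach an AI tool.

```python
from datetime import date

def is_valid(record: dict) -> bool:
    """Accept a record only if every field passes a basic sanity check."""
    try:
        return (
            bool(record["customer_id"].strip())            # identifier present
            and 0 < record["invoice_total"] < 1_000_000    # plausible amount
            and date.fromisoformat(record["invoice_date"]) <= date.today()
        )
    except (KeyError, ValueError, AttributeError, TypeError):
        return False  # malformed records are rejected, not silently passed on

records = [
    {"customer_id": "C-001", "invoice_total": 420.50, "invoice_date": "2024-03-01"},
    {"customer_id": "", "invoice_total": -10.00, "invoice_date": "not a date"},
]

vetted = [r for r in records if is_valid(r)]
print(f"{len(vetted)} of {len(records)} records passed vetting")  # 1 of 2
```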

It is also important that employees are trained in how to use AI tools, best practices for their use, the potential risks involved, and how to identify and report suspicious activity. Businesses should also be realistic about any skills gap before adopting AI.

Human oversight is required, as many AI applications produce serious errors in sourcing, accuracy and accountability. Businesses should ensure that all AI-generated content is cross-checked for accuracy and could not reasonably be considered plagiarised.

  • Safeguarding intellectual property – the business’ and that of others. 

Intellectual property (IP) needs to be respected and protected. AI uses data models that borrow from IP such as software, writing, art and music – disintermediating the owners of that IP (i.e. cutting them out of the exchange). When Google is used to search for something, it typically returns a link to the source or originator of the IP; this is not necessarily the case with AI. AI tools may inadvertently use copyrighted material without the content creator's permission, resulting in potential legal consequences.

Plagiarism is one of the most prominent risks associated with text-based AI tools. While AI technology can generate content quickly, it cannot create text that is 100% unique. Plagiarism is rife. The Domo report, for example, found 52 college papers are flagged for AI plagiarism on Turnitin every minute. If a business uses AI-generated content as its own without manual editing, it runs the risk of creating duplicate content that could be flagged as plagiarism.

Currently, works that are solely AI-generated are not protected under Australian copyright law. The Government has established a Copyright and Artificial Intelligence Reference Group to address copyright issues raised by AI.

The business could have its IP infringed, or it could infringe that of another, so inputting proprietary data needs to be carefully considered. One option may be to use retrieval-augmented generation (RAG) within the business' own secure server environment. This allows the business to keep all its proprietary data secure and out of the public domain, while still using it as context in its prompts and outputs. Any information generated via AI should be carefully scrutinised and verified by the business prior to use to ensure the IP of another is not being infringed.
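
As an illustration of the concept (not of any particular product), the sketch below shows a heavily simplified retrieval-augmented generation step in Python. The document store, the crude keyword-overlap scoring and the final model call are all stand-ins; a production system would use embeddings and a vector database hosted within the business' own secure environment.

```python
import string

INTERNAL_DOCS = [  # hypothetical proprietary documents kept on a secure server
    "Our refund policy allows returns within 30 days of purchase.",
    "Warehouse maintenance is scheduled for the first Monday of each month.",
    "All client contracts must be reviewed by the legal team before signing.",
]

def tokens(text: str) -> set[str]:
    """Lower-cased words with punctuation stripped; very short words ignored."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return {w for w in cleaned.split() if len(w) > 3}

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents sharing the most words with the query."""
    return sorted(INTERNAL_DOCS,
                  key=lambda d: len(tokens(d) & tokens(query)),
                  reverse=True)[:k]

def build_prompt(query: str) -> str:
    """Combine retrieved internal context with the user's question."""
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# The assembled prompt would then be sent to a model hosted inside the
# business' own environment (that call is omitted here).
print(build_prompt("What is the refund policy?"))
```

Because the proprietary documents never leave the business' infrastructure, they are used only as context at query time rather than being absorbed into a public model's training data.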

  • Using software and security tools to safeguard against AI security risks. 

Businesses should implement strict access controls. Limiting data access to only those who need it is one of the best ways to reduce the risk of accidental data exposure and of sensitive data finding its way into AI training sets.
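
A minimal sketch of that idea in Python (with roles and dataset names invented for the example; real deployments would enforce this through an identity and access management system rather than application code):

```python
# Hypothetical role-to-dataset permissions.
PERMISSIONS = {
    "data_scientist": {"sales_aggregates", "anonymised_usage"},
    "support_agent": {"ticket_history"},
}

def fetch_for_training(user_role: str, dataset: str) -> str:
    """Release a dataset to an AI pipeline only if the role permits it."""
    if dataset not in PERMISSIONS.get(user_role, set()):
        raise PermissionError(f"{user_role!r} may not access {dataset!r}")
    return f"<contents of {dataset}>"  # placeholder for the real data load

print(fetch_for_training("data_scientist", "sales_aggregates"))  # allowed
# fetch_for_training("support_agent", "sales_aggregates")  # raises PermissionError
```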

As there are many stages in the data life cycle (ingestion, analytics, sharing and storage), businesses should ensure that each stage is protected using appropriate methods – masking, tokenisation or encryption.
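
The sketch below (Python standard library only, with a hard-coded key purely for demonstration) illustrates two of those methods, masking and tokenisation; encryption should use a vetted library rather than anything hand-rolled.

```python
import hashlib
import hmac

def mask(value: str, visible: int = 4) -> str:
    """Hide all but the last few characters, e.g. for display or logging."""
    return "*" * (len(value) - visible) + value[-visible:]

def tokenise(value: str, key: bytes) -> str:
    """Replace a value with a keyed, irreversible token (HMAC-SHA256).

    The same input always yields the same token, so records can still be
    joined and analysed without exposing the underlying value.
    """
    return hmac.new(key, value.encode(), hashlib.sha256).hexdigest()[:16]

SECRET_KEY = b"demo-key-only"  # in practice, fetched from a secrets manager

card = "4111111111111111"
print(mask(card))                  # ************1111
print(tokenise(card, SECRET_KEY))  # a stable 16-character token
```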

Where possible, businesses should use secure-by-design AI tools and services. 

There are several security tools available for businesses to consider. Some are designed to detect AI model poisoning (where malicious data or code infiltrates an AI system) and other malware patterns. Others provide secure authentication services, multifactor authentication and encryption, while also protecting against unauthorised access, data loss and the injection of malicious code. There are also text-classification tools that can detect whether content was generated by AI or a human.

Securing and protecting data is crucial for sustaining trusted AI operations. Despite best protection efforts, businesses using AI can be exposed to a range of risks. No single insurance policy will cover all potential exposures from adopting AI. Talk to an EBM cyber specialist about your risk exposures, risk management and mitigation strategies, and the insurance covers that can help to protect your business.