The Essential Online Investigation Guide for Websites, Social Media, and Team Collaboration Tools

How to Collect and Preserve Online Evidence on Websites, Social Media Platforms, and Enterprise Social Networks

We spend an astounding amount of time online—we chat, shop, share, bank, and conduct research that leads directly to major life decisions. 

Just consider the following: at the moment, there are around 3.84 billion active social media users worldwide, spread across platforms including Facebook, Instagram, Twitter, LinkedIn, and Snapchat.

For legal teams, all this online activity has profound implications. According to a 2017 survey conducted by Robert Half Legal, 52% of lawyers reported an increase over the preceding two years in lawsuits involving posts, images, and data found on social media.

In 2018, a survey conducted by the International Legal Technology Association (ILTA) revealed that 90% of law firms had conducted social media discovery that year. It also showed that the number of firms handling at least 20 matters involving social media had increased by 46% year-over-year.

And then there are team collaboration tools like Slack and Workplace from Meta. As high-profile news reports have shown, enterprise collaboration conversations often play a central role in employee-related disciplinary and legal matters.

eDiscovery vendor Logikcull states, “if your discovery process ignores Slack, you’re missing half the conversation.”

With team collaboration tools increasingly replacing email as the main form of internal communication, legal teams need to consider how to collect and preserve this data for litigation.

This guide will examine the challenges that come with the eDiscovery of social media and enterprise collaboration data and also provide solutions for effective collection.


The New Challenges of Discovery

When it comes to modern organizations, traditional litigation-related discovery has been supplanted almost completely by eDiscovery. In other words, legal teams are dealing with far more electronically stored information (ESI) than they are with physical documents.

In fact, the average civil case now contains 6.5 million digital pages (130 GB of data). And unlike the physical documents of old, this data can be difficult to collect, preserve, and assess.

It’s not only the sheer volume of information; online data like social media content and enterprise collaboration conversations also pose unique challenges that need to be considered.

Multifaceted Nature

Unlike other forms of content, these data sources can often contain videos, images, comments, and likes. For this reason, it’s important to find a solution that captures all this content effectively.

Deep-Linked Content

A good 30% of social media messages contain links to third-party website content like GIFs, videos, and articles. For legal teams, being able to see this shared content in context is useful but sometimes tricky. When studying a CSV export of this data, for instance, the relevance of a link can be overlooked.

Ever-Evolving Platforms

Although social media networks like Twitter, Facebook, and Instagram may have much in common (like the ability to use hashtags), they are also all unique platforms with their own structures and capabilities. A common issue with the collection of content is that these platforms are constantly evolving, which can instantly make existing collection methods obsolete. For example, many legal teams relied on API tools to collect social media evidence on Facebook accounts that they did not own the credentials for, but when the company revoked API access on April 4, 2018, these tools suddenly became useless.

Real-Time Activity

The beauty of social media is the speed at which it operates, but it can also make evidence collection tricky. When new posts and comments are constantly appearing, trying to understand all the content and identify what is relevant can feel overwhelming. You could easily find that a post is edited, or that new comments keep appearing, even as you’re trying to collect a piece of social media evidence.

Capturing Content Before It Is Deleted

There is no guarantee that a relevant social media post will still be available online a day, hour, or even a minute from now, which is why it is often crucial to capture and preserve online evidence as soon as it is discovered.

If it’s only a single webpage or social media post that has to be collected, this may not be a problem, but when it’s an entire website or social media account that has to be captured, doing it before important evidence disappears can be difficult.

Creating Defensible Digital Evidence

Collecting online evidence is one thing; ensuring that it complies with court rules for digital evidence and has a clear chain of custody is quite another.

For example, it might be quick and easy to grab a screenshot of a social media post as a way of collecting evidence, but proving authenticity and integrity can be difficult. Since it’s an easy process to alter a JPEG or PDF, legal teams need to consider how they will prove that content accurately reflects what was displayed online.

Efficient, Cost-Effective Evidence Collection

Undoubtedly, the biggest challenge legal teams face is finding an efficient, cost-effective way to capture online evidence, especially when one adds the factors that (i) it has to be done in a timely manner and (ii) it has to stand up to the demands of digital evidence.

Traditional methods like screenshotting can take an inordinate amount of time when legal teams deal with large websites and very active social media accounts. A social media account, like a Facebook page or an Instagram account, can be particularly frustrating, since capturing everything requires scrolling through endless timelines and expanding hundreds (if not thousands) of comments and replies.


Evidence Collection and the Social Media Environment

Now that we’ve outlined some of the challenges that come with the collection of online evidence, let’s examine what these challenges look like practically when it comes to social media. Once we’ve done that, we’ll turn our attention to enterprise collaboration platforms in the next section.

The Evolution of Evidence

The public nature of social media makes it very different from most other sources of evidence. On the positive side, the fact that content on platforms like Twitter, Facebook, and Instagram is often publicly accessible gives investigative teams a great opportunity to capture crucial evidence simply by examining a public profile and collecting what they want. On the downside, this evidence is often changing and evolving in real-time.

For instance, just consider a photo posted on Facebook. That post could see hundreds of comments, many of which could have relevance to the legal matter at hand. Then you have to consider that the post could be edited or deleted—and the comments underneath it could also be edited and deleted. So how do you collect and preserve evidence when it’s constantly changing? And what do you do if that crucial bit of evidence suddenly disappears from the timeline?

Decoding Social Media Evidence

As mentioned in the previous section, the multifaceted nature of social media is another important thing to consider. Social media is not one particular kind of media—it consists of text, photos, videos, GIFs, comments, emojis, likes, shares, etc.

Successfully collecting all relevant content is difficult enough, but that’s only half the battle. There will come a time when this information needs to be assessed and reviewed, and doing this with stitched-together PDFs and CSV spreadsheets can be very difficult.

Just consider the humble emoji. These are increasingly finding their way into legal matters, but decoding emojis during eDiscovery can be difficult. Their meaning is incredibly dependent on context, but how do you effectively retain context and intent when collecting social media evidence? The answer to that question is provided in Section 4 of this guide.


Evidence Collection “Where Work Happens”

The tagline for Slack is “where work happens.” And it’s not an exaggeration. Team collaboration tools like Slack, Workplace from Meta, and Microsoft Teams have fundamentally changed the way many organizations operate; internal emails have dwindled as employees increasingly communicate, collaborate, and share over these platforms. Many organizations see email use reduced by 30% or more when adopting a team collaboration tool.

In addition to cutting down on distracting emails, enterprise collaboration platforms also have other productivity benefits. For instance:

  • They can help reduce task switching and save time. By combining a tool like Slack with something like Google Docs, it becomes much easier to communicate and collaborate. Instead of first having to discuss changes to a document through email and then try to apply those changes in the doc, collaboration tools allow the entire process to be seamless.
  • Collaboration tools allow employees to focus while still ensuring that everyone is looped in. Channels and groups are great tools for managing projects and communicating without bothering everyone with constant emails. “Slack is business done right. When you have collaboration happening in one spot, leadership doesn’t need to be copied on an email. You can hop into a Slack channel, cruise along with the project and jump in where needed,” says Benjamin Sternsmith, Vice President of Sales for Lyft Business.
  • Enterprise collaboration platforms improve company culture. When it comes to culture and employee engagement, open communication is key. Tools like Slack and Workplace from Meta can be used to connect employees, create a company-wide connection to leaders, celebrate success, and even gauge employee sentiment.

The ROI of Enterprise Collaboration Tools

How significant are the benefits that enterprise collaboration tools offer? According to many experts, the benefits are substantial—but not enough organizations have adopted these tools yet.

“The most powerful applications of social technologies in the global economy are largely untapped. By using social technologies, companies can raise the productivity of knowledge workers by 20 to 25 percent,” writes McKinsey in a report titled The social economy: Unlocking value and productivity through social technologies.

Workplace from Meta has also commissioned an impact study of enterprise collaboration software that found that these tools can offer an ROI of 400%.

Forrester found that Workplace from Meta could offer:

  • 34% reduction in the time it takes to update frontline workers
  • 20% increase in task efficiency
  • 80% reduction in cloud operating expenses for legacy tools
  • 20% faster decision-making
  • 32% increase in product innovation
  • 24% increase in onboarding efficiency

Enterprise Collaboration and Information Governance

Another important benefit of these platforms is that they can aid in improving information governance within an organization.

A great way of maintaining good information governance while also facilitating collaboration is to implement a team collaboration tool. Platforms like Workplace from Meta, Slack, and Microsoft Teams can act as a hub that connects employees, while also providing information governance teams with a central hub for shared data and communication.

“When we say that Slack is a collaboration hub, we don’t just mean people sending messages to one another, but more broadly, the work enabled across teams and the many business systems, data and applications that power productivity for our customers around the world. When all of these elements come together in Slack, that’s when we truly deliver as a collaboration hub,” writes Slack.

By plugging file management solutions, calendars, productivity apps, and web conferencing software into an enterprise collaboration hub, an organization can gather disparate remote tools and make the job of information governance professionals much easier—all while empowering its workforce in the process.

Finding Hidden Records

While enterprise collaboration tools can be great for pulling disparate records together, it’s important to keep the governance implications in mind. Like email and social media, enterprise social networks are responsible for large volumes of data within enterprises, and this data can be incredibly valuable yet complex to manage. Companies need to carefully consider the legal and data governance requirements of any enterprise collaboration tools they implement. As with email, enterprise social networks can be used for inappropriate conversations and unsanctioned data sharing.

“Legal teams are used to documents. Not chat rooms. But chat rooms are taking over,” writes Logikcull in a guide on discovery and investigations in Slack. “In one survey, nearly 20% of companies who adopted Slack saw their email use decline by 40 to 60%. Today, if you’re only dealing with emails, you’re missing half the story… With Slack, users can direct message, create chat rooms, share files, edit—or, depending on the context, spoliate—Slack messages from the past, and more.”

As enterprise social networks continue to supplant email and become the hubs that facilitate internal communication and collaboration, organizations need to implement these tools in a way that ensures they retain a tight hold on data.

To do this successfully, companies need to be able to:

  • Monitor activity in enterprise collaboration tools to identify inappropriate conversations
  • Put a data loss prevention (DLP) strategy in place that prevents the sharing of sensitive data
  • Collect and preserve all data in real-time to facilitate compliance and litigation
  • Place flagged data on legal hold to prevent disposal as part of regular retention scheduling


How to Collect Website & Social Media Evidence

Now that we’ve examined some of the inherent challenges that come with managing and collecting website, social media, and enterprise collaboration content, we can look at how this information is collected in practical terms.

In order to do this, we need to distinguish between company-owned sources and third-party sources. Company-owned sources are the websites, social media accounts, and enterprise collaboration tools that are under the control of the organization. Third-party sources are sites and accounts owned by other companies and individuals.

It goes without saying that collecting and preserving evidence within data sources that you own is much simpler than dealing with external sources. In fact, with the right tools, preservation can largely be automated.

With third-party sources, investigators need to spend time and effort to identify relevant evidence — and they are often limited to what is publicly visible. Exporting evidence in a way that proves authenticity can also be harder.

With the above in mind, let’s look at some of the considerations that come with collecting evidence from third-party sources.

Collecting and Preserving Third-Party Evidence

Consider All the Sources

When it comes to looking for online evidence, Facebook and Instagram tend to be the most useful platforms, but there are many others worth considering. In fact, there are around 200 widely used social media sites at the moment, so if you’re limiting your investigation to the top two or three, you could be missing out on crucial evidence. Just consider the December 2019 case in which a claim of serious physical injury was proved false with posts of a 10-mile run and a 20-mile bike ride on the fitness-oriented social media platform Strava.

Consider Someone’s Social Connections

While an individual being investigated might be clever enough to utilize strict social media privacy settings and refrain from posting incriminating content, they probably won’t be able to keep all their activities hidden. A bystander might, for example, upload an incriminating video to YouTube. Similarly, someone’s friends, family, roommates, teammates, or colleagues might also post useful images and information. Because of this, it’s worth exploring the accounts of people in an individual’s social circle, as well as any pages belonging to an employer, association, sports team, etc.

Use Tools to Find Online Profiles

A simple Google search remains a good way of finding a particular individual’s online profiles, but other excellent tools also exist. Search tools like Pipl, Peoplefinders, PeekYou, and Classmates can all be used to identify social media profiles. If you have an image and would like to see where it appears online, TinEye is another great tool.

Always Obtain Evidence Legally

It’s important to stay on the right side of the law. While law enforcement might sometimes be able to create fake social media profiles to investigate suspects, law firms and fraud investigators don’t have that same freedom. And even API tools that were once very useful for collecting social media evidence are now creating severe preservation challenges thanks to privacy concerns. So, when collecting social media evidence, focus only on content that you can view and capture legally.

Make Sure Evidence Is Defensible

Collecting incriminating evidence is only half the battle—legal teams also need to be able to convince other parties of the information’s authenticity. While taking a simple screenshot might seem like a quick and easy way of collecting evidence, it’s all too easy for the person under investigation (or their legal counsel) to question the quality of that content. Instead, investigators should opt for a capture method that furnishes evidence with a hash value that authenticates data and collects all the associated metadata of a social media post.
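
To make the idea of a hash value concrete, here is a minimal Python sketch. It is an illustration only and does not describe how WebPreserver or any other product works internally; the function names (make_capture_record, verify_capture) and the record fields are hypothetical. It computes a SHA-256 digest of captured content at collection time, stores it with basic metadata, and later checks whether the content still matches that digest.

```python
import hashlib
from datetime import datetime, timezone


def make_capture_record(content: bytes, source_url: str, collector: str) -> dict:
    """Build a capture record for collected content (hypothetical schema)."""
    return {
        "source_url": source_url,
        "collector": collector,
        # Timestamp taken at capture time, in UTC.
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "size_bytes": len(content),
        # Digest of the exact bytes that were captured.
        "sha256": hashlib.sha256(content).hexdigest(),
    }


def verify_capture(content: bytes, record: dict) -> bool:
    """Return True only if the content still matches the recorded digest."""
    return hashlib.sha256(content).hexdigest() == record["sha256"]


if __name__ == "__main__":
    captured = b"<html><body>Example post text</body></html>"
    record = make_capture_record(captured, "https://example.com/post/1", "investigator-01")
    print("unchanged content verifies:", verify_capture(captured, record))
    print("altered content verifies:", verify_capture(captured + b" ", record))
```

Because even a one-byte change produces a different digest, a record like this lets a reviewer check that an exported file is the same one that was originally captured. A digest alone does not prove who captured the content or when, which is why the metadata and chain of custody still matter.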

Use the Right Preservation Tools

Related to the points above, it’s important to make use of a preservation tool that not only allows evidence to be captured legally, but also provides defensible evidence that proves the authenticity of a record. It is with this in mind that we created the browser-based evidence tool called WebPreserver.

With WebPreserver, legal and investigations teams can:

  • Capture online evidence from social media accounts and websites with two simple clicks
  • Expand their social media discovery—capture entire Facebook or Twitter timelines
  • Export content as searchable PDFs and search timelines for relevant keywords
  • Capture video with ease—quickly collect and authenticate videos from Facebook, Twitter, YouTube, and Instagram
  • Store captures directly to their computers, maintaining full digital chain of custody during collections

Collecting and Preserving Company-Owned Records

As mentioned in the introduction of this section, the collection and preservation of company-owned sources is easier — especially if an organization makes use of an automated solution.

Companies shouldn’t be relying on screenshots to capture this data, nor should they be depending on CMS backups, or the inherent archiving capabilities of platforms like Facebook, Twitter, and Instagram.

Any implemented system or solution should:

  • Automatically collect and preserve website and social media content
  • Make it easy for in-house legal and investigative teams to access this data
  • Provide easy export of content in a defensible format
  • Allow data to be placed on legal hold

Social Media Collection

Organizations should be leveraging a solution that has API integrations with platforms like Facebook, Instagram, and Twitter. This ensures that data is collected in real-time and that all changes, deletions, and linked content are captured. Without an API integration that allows for real-time collection, there’s a high likelihood that crucial changes and communications will be missed, and that archives will consequently be incomplete. With API integration, there’s also the added benefit of being able to archive content retroactively: as long as the data is still available on the original platform, it can be collected and placed in an archive.

Website Collection

When dealing with websites, data should be crawled regularly to capture all additions, edits, and deletions across a site. Depending on how often website content is updated, a site would typically be crawled once per day or once per week. Crucially, any solution that’s put in place should be capable of dealing with the latest complex sites. It should be able to capture web pages generated client-side by JavaScript/Ajax frameworks, including Ajax-loaded content. It should also be capable of collecting multiple steps in web form flows and capturing webpage content that is displayed after a user event (for example, when a section on a webpage loads additional content using Ajax after a user clicks).
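
As a rough illustration of how successive crawls can reveal additions, edits, and deletions, the Python sketch below compares two snapshots of a site. It assumes the crawler has already produced a mapping of URL to page content; the function names (snapshot, diff_snapshots) and the sample pages are hypothetical, and a real crawler would also need to render JavaScript-driven pages before hashing them.

```python
import hashlib


def snapshot(pages: dict[str, str]) -> dict[str, str]:
    """Map each URL to a SHA-256 digest of its page content."""
    return {
        url: hashlib.sha256(body.encode("utf-8")).hexdigest()
        for url, body in pages.items()
    }


def diff_snapshots(previous: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Classify URLs as added, deleted, or edited between two crawls."""
    shared = previous.keys() & current.keys()
    return {
        "added": sorted(current.keys() - previous.keys()),
        "deleted": sorted(previous.keys() - current.keys()),
        # A URL present in both crawls with a different digest was edited.
        "edited": sorted(url for url in shared if previous[url] != current[url]),
    }


if __name__ == "__main__":
    monday = snapshot({"/": "Home", "/about": "About us", "/promo": "Sale"})
    tuesday = snapshot({"/": "Home", "/about": "About our team", "/news": "Launch"})
    print(diff_snapshots(monday, tuesday))
```

Comparing digests rather than full page bodies keeps each comparison cheap, but it only shows that a page changed, not what changed, so the full content of every crawl still has to be archived.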

Preservation of Content

Once information has been captured, part of the preservation process is placing that data in an archive. What differentiates an archive of online data from a basic back-up is the fact that properly archived records are indexed, meaning that the content is compiled in a way that makes it easy to search. So when the legal team needs to find a specific record, all that’s required is a simple search and not a labor-intensive trawl through thousands of files. Properly indexed data also maintains relationships between data and users (allowing for the posts and comments of a specific user to easily be identified), and even allows metadata to be searched.
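
The difference between a back-up and an indexed archive can be sketched in a few lines of Python. This is a simplified illustration under assumed names (build_indexes, search, and the record fields id, author, and text are all hypothetical), not a description of any particular archiving product. It builds one index from words to records and another from users to records, so that both keyword searches and per-user lookups avoid scanning every file.

```python
import re
from collections import defaultdict


def build_indexes(records: list[dict]) -> tuple[dict, dict]:
    """Return (term -> record ids, author -> record ids) for the given records."""
    by_term = defaultdict(set)
    by_author = defaultdict(set)
    for record in records:
        by_author[record["author"]].add(record["id"])
        # Lowercase and split on non-word characters so searches ignore case.
        for token in re.findall(r"\w+", record["text"].lower()):
            by_term[token].add(record["id"])
    return by_term, by_author


def search(by_term: dict, *terms: str) -> set:
    """Return ids of records containing every search term."""
    matches = [by_term.get(term.lower(), set()) for term in terms]
    return set.intersection(*matches) if matches else set()


if __name__ == "__main__":
    records = [
        {"id": 1, "author": "alice", "text": "Ran a 10-mile race today"},
        {"id": 2, "author": "bob", "text": "Great race, Alice!"},
        {"id": 3, "author": "alice", "text": "Resting at home"},
    ]
    by_term, by_author = build_indexes(records)
    print(search(by_term, "race"))
    print(search(by_term, "race") & by_author["alice"])
```

Intersecting the keyword result with the author index is what makes it possible to pull only one user's posts on a topic, which is the kind of relationship between data and users described above.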

Discovery and Legal Hold

It’s important that website and social media content be easily searchable, exportable, and processable for legal purposes—and that it can be ingested by eDiscovery platforms. The ability to place a legal hold is another important consideration. Data doesn’t stay in an archive forever. Organizations can be expected to retain official records for anything from three to 10 years, and once that retention period is reached, information is often deleted. However, if the data is needed for legal purposes, this should be overridden to ensure that evidence isn’t lost. Any archiving solution should therefore enable the organization to easily place a page, post, or conversation on legal hold to prevent (intentional or unintentional) spoliation of evidence and preserve it for litigation.
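
The interaction between a retention schedule and a legal hold can be expressed as a simple rule, sketched below in Python. The class and function names (ArchivedRecord, eligible_for_disposal) and the seven-year default are assumptions chosen for illustration; actual retention periods depend on the organization and its regulators.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class ArchivedRecord:
    record_id: str
    archived_on: date
    on_legal_hold: bool = False


def eligible_for_disposal(record: ArchivedRecord, today: date, retention_years: int = 7) -> bool:
    """Return True if the record may be deleted under the retention schedule."""
    # A legal hold always overrides the retention schedule.
    if record.on_legal_hold:
        return False
    # Approximation: treats every year as 365 days and ignores leap days.
    expires_on = record.archived_on + timedelta(days=365 * retention_years)
    return today >= expires_on


if __name__ == "__main__":
    today = date(2024, 1, 1)
    expired = ArchivedRecord("post-1", date(2010, 1, 1))
    held = ArchivedRecord("post-2", date(2010, 1, 1), on_legal_hold=True)
    print(eligible_for_disposal(expired, today))
    print(eligible_for_disposal(held, today))
```

Checking the hold before the expiry date means a record that has passed its retention period still cannot be disposed of while litigation is pending.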

We offer solutions with the above requirements in mind.


How to Collect Evidence in Team Collaboration Tools

To understand the legal challenges of modern enterprise collaboration platforms, it’s useful to compare them to email. Although it is hard to imagine today, there was a time when organizations were not entirely sure how emails should be stored and managed to meet litigation needs. As the technology evolved, and key regulators started to hand down specific rules and guidance, companies slowly understood what was required of them and implemented robust systems and processes.

Today, just about every company understands that employee emails have to be retained for compliance and litigation, and consequently has some sort of email vault or other archiving solution in place.

The eDiscovery Challenges of Team Collaboration Tools

A team collaboration tool like Slack, Workplace from Meta, or Microsoft Teams is similar to email as far as preservation requirements for compliance and litigation go, but the data is more complex. Unlike emails, which are discrete and sequential, communication and collaboration in a tool like Slack or MS Teams is dynamic and real-time. Not only do employees share files and communicate constantly, but messages can also be edited and deleted, so what you see in a channel or direct conversation right now is not necessarily what appeared a day, a week, or a month ago.
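
One way to reason about content that keeps changing is to preserve every post, edit, and deletion as a time-stamped event, then replay those events to see what a channel looked like at a given moment. The Python sketch below illustrates that idea under assumed names; the event fields (at, type, message_id, text) and the function channel_state_at are hypothetical and do not reflect the export format of Slack, Microsoft Teams, or any other platform.

```python
from datetime import datetime


def channel_state_at(events: list[dict], as_of: datetime) -> dict[str, str]:
    """Replay events up to as_of and return message_id -> visible text."""
    state: dict[str, str] = {}
    for event in sorted(events, key=lambda e: e["at"]):
        if event["at"] > as_of:
            break
        if event["type"] in ("post", "edit"):
            # An edit replaces the text that was visible before it.
            state[event["message_id"]] = event["text"]
        elif event["type"] == "delete":
            state.pop(event["message_id"], None)
    return state


if __name__ == "__main__":
    events = [
        {"at": datetime(2024, 1, 1, 9, 0), "type": "post", "message_id": "m1", "text": "Ship it Friday"},
        {"at": datetime(2024, 1, 1, 9, 5), "type": "edit", "message_id": "m1", "text": "Ship it Monday"},
        {"at": datetime(2024, 1, 1, 9, 30), "type": "delete", "message_id": "m1"},
    ]
    print(channel_state_at(events, datetime(2024, 1, 1, 9, 1)))
    print(channel_state_at(events, datetime(2024, 1, 1, 10, 0)))
```

A screenshot taken after 9:30 in this example would show nothing at all, while the event log still records both the original wording and the edit.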

Because of the dynamic nature of enterprise collaboration content, as well as the sheer amount of data that is created every single day, manual searches and screenshots are not sufficient eDiscovery solutions.

Streamlining Legal Investigations in Team Tools

The first step in managing the eDiscovery of a team collaboration tool is setting correct retention settings. Team collaboration tools allow you to set retention periods for channels and conversations — Slack, for instance, retains all messages for the lifetime of a workspace by default. You want to make sure that these settings align with the retention periods of your larger organization. You might not want to retain messages forever, but you also do not want to delete data too quickly, leaving the legal team unable to retrieve these records.

The next step in facilitating legal investigations within a team collaboration tool is leveraging the eDiscovery integrations that these platforms offer. Why not simply make use of the retained data within the platform itself? The main reason is that it can be difficult (if not impossible) to export content in a format that is defensible and easy to submit during a legal matter.

Far better is an eDiscovery solution that is designed to plug into a collaboration platform to make content easy to search and export. It has to be added that not all eDiscovery products are created equal, but the best solutions will create a database of saved records that retains the native look and feel of the collaboration platform, as well as the context and relationships of data and users.

With a solution like Pagefreezer, legal teams can:

  • Add users and groups to the Pagefreezer dashboard and then instantly view a live replay of all content
  • Use advanced search to deliver relevant content across all archives, accounts, direct conversations, timelines, and groups
  • Instantly select relevant content, add comments, and export files to local servers for eDiscovery purposes
  • Export data to file formats such as PDF and WARC. Records are time-stamped and signed with a SHA-256 digital signature. All associated metadata is included in the export
  • Place users and data on legal hold to prevent the deletion of crucial evidence


Learn More

Modern online data sources introduce new challenges to online investigations, but they also provide opportunities. The fact of the matter is, no legal or investigative team can afford to ignore these channels and platforms. Evidence exists there, waiting to be discovered.

©2024 Pagefreezer Software Inc. All Rights Reserved. Privacy Policy and Acceptable Use Policy. Commercial use and distribution of the contents of this website is not allowed without express and prior written consent of Pagefreezer Software Inc. subject to existing copyright exceptions and limitations.