I think this is a current list, if anyone wants to block access by user agent:
FacebookExternalHit: The primary purpose of FacebookExternalHit is to crawl the content of an app or website that was shared on one of Meta’s family of apps, such as Facebook, Instagram, or Messenger.
Note that the FacebookExternalHit crawler might bypass robots.txt when performing security or integrity checks, such as checking for malware or malicious content.
Meta-ExternalAgent: The Meta-ExternalAgent crawler crawls the web for use cases such as training AI models or improving products by indexing content directly.
Meta-ExternalFetcher: The Meta-ExternalFetcher crawler performs user-initiated fetches of individual links to support specific product functions. Because the fetch was initiated by a user, this crawler may bypass robots.txt rules.
FacebookBot: FacebookBot crawls public web pages to improve language models for our speech recognition technology.
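For anyone who wants to act on the list, here is a rough Python sketch of a user-agent check, not an official Meta method. It assumes the four names above appear as substrings of the User-Agent header; the names `META_CRAWLER_TOKENS`, `is_meta_crawler`, and `robots_txt_rules` are made up for illustration. Since two of the crawlers may bypass robots.txt, the sketch includes a server-side check as well as the robots.txt rules.

```python
from typing import Optional

# User-agent names taken from the list above.
META_CRAWLER_TOKENS = (
    "FacebookExternalHit",
    "Meta-ExternalAgent",
    "Meta-ExternalFetcher",
    "FacebookBot",
)


def is_meta_crawler(user_agent: Optional[str]) -> bool:
    """Return True if the User-Agent header contains one of the listed names.

    Matching is a case-insensitive substring test (an assumption of this
    sketch, not something the source specifies).
    """
    ua = (user_agent or "").lower()
    return any(token.lower() in ua for token in META_CRAWLER_TOKENS)


def robots_txt_rules() -> str:
    """Build robots.txt rules that disallow every listed crawler site-wide."""
    blocks = [f"User-agent: {token}\nDisallow: /" for token in META_CRAWLER_TOKENS]
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    # Paste the printed rules into robots.txt; use is_meta_crawler() in your
    # request handler to return a 403 for crawlers that ignore robots.txt.
    print(robots_txt_rules())
```

In a web app you would call `is_meta_crawler(request_headers.get("User-Agent"))` (where `request_headers` is whatever header mapping your framework provides) and reject the request when it returns True. Keep in mind that user-agent strings can be spoofed, so this only filters clients that identify themselves honestly.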
Sources:
https://developers.facebook.com/docs/sharing/webmasters/web-crawlers/
https://developers.facebook.com/docs/sharing/bot/
#Facebook #Meta #WebCrawling