cachevilla.blogg.se

Twitter archive

On May 29, 2020, Twitter attached a disclaimer to one of U.S. President Donald Trump’s tweets on the Minneapolis protests. This disclaimer labeled the tweet as violating the Twitter Rules about glorifying violence. A similar case happened on May 26, 2020, when Twitter added a fact-check label to two of the President’s tweets on mail-in voting. On replaying these tweets through web archives, most of the archived pages, or mementos, failed to include the disclaimer and the label. This tweet thread shows how different archiving platforms failed to replay Twitter's disclaimer. Figure 1 shows how the disclaimer is missing in the archived copy in the Internet Archive's Wayback Machine.

The other visible difference shown in the above image is the variation in Twitter UI. The memento of the Minneapolis protests tweet has the old Twitter interface, while the live tweet itself has the new Twitter interface. The disclaimer and the label are likely features of Twitter’s new UI. This difference in interface is also the cause of the missing disclaimer. Currently, most web archiving services are unable to archive the new UI successfully.

While loading both versions of the page, we captured the network traffic for two minutes using the Chrome developer tools and exported the data in HTTP Archive (HAR) format. We used Haralyzer to analyze the differences between the two UIs (the data is available in a GitHub repo). We were able to verify that the requests generated by the legacy interface and the new interface differ in many ways. As you can see in Figures 5 and 6, the old Twitter UI is sending “” request through and the new UI is making a similar type of request but through. Based on our analysis, we noticed that the old UI only communicated with the API for authentication and configuration of embedded video. We have learned that the underlying architecture of the new is focused on responsive web design and is built to serve both mobile and desktop users. The new interface mostly communicates through .

The Effect of UI Change on Different Web Archives

This motivated us to dig deeper into how the Internet Archive's Save Page Now (SPN) responds to different user agents (UAs). It looked like “Googlebot” is not the only UA that worked. Some UAs with the word "bot" in the string work, but not all of them do. We tested with user agents such as Bot, Twitterbot, and AhrefsBot, which resulted in successful captures.

$ curl -iLs -A " Mozilla/5.0 (compatible AhrefsBot/5.2 +) " /save/

We further tested using the top 10 most popular web crawlers and user agents. Out of those user agents, we found that when the UA was set to Bingbot, Slurp, DuckDuckBot, Baiduspider, YandexBot, or Facebot, we were able to archive Twitter pages as expected. However, with user agents such as Sogou Spider, Exabot, ia_archiver (Alexa crawler), and Dotbot, SPN failed to capture the mementos properly.
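The HAR comparison described above can be roughed out in a few lines. The sketch below is not the Haralyzer-based analysis from the post. It uses only Python's standard json module and relies on the HAR layout in which each request URL lives at log.entries[].request.url. The function names, the example hosts, and the two in-memory captures are all hypothetical stand-ins for the real exports.

```python
import json
from collections import Counter
from urllib.parse import urlsplit


def load_har(path):
    """Read a HAR file (plain JSON) exported from the browser's developer tools."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def requests_per_host(har):
    """Count requests per host, reading log.entries[].request.url."""
    hosts = Counter()
    for entry in har["log"]["entries"]:
        host = urlsplit(entry["request"]["url"]).hostname or ""
        hosts[host] += 1
    return hosts


def compare_hosts(old_har, new_har):
    """Return (hosts seen only in the old capture, hosts seen only in the new one)."""
    old_hosts = set(requests_per_host(old_har))
    new_hosts = set(requests_per_host(new_har))
    return sorted(old_hosts - new_hosts), sorted(new_hosts - old_hosts)


# Hypothetical captures standing in for the two exported HAR files.
old_capture = {"log": {"entries": [
    {"request": {"method": "GET", "url": "https://old.example/page"}},
    {"request": {"method": "GET", "url": "https://static.example/app.js"}},
]}}
new_capture = {"log": {"entries": [
    {"request": {"method": "GET", "url": "https://static.example/app.js"}},
    {"request": {"method": "GET", "url": "https://api.example/data.json"}},
    {"request": {"method": "GET", "url": "https://api.example/more.json"}},
]}}

only_old, only_new = compare_hosts(old_capture, new_capture)
print("hosts only in old capture:", only_old)
print("hosts only in new capture:", only_new)
```

With real exports, the two dictionaries would instead come from load_har called on each saved file. Grouping by host is only one possible view of the data; the same loop could just as easily group by request method or path.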














