Clickbait Killer – An App I Wrote to Remove Clickbait Spam from Facebook

If you have had a Facebook account for the past year or so, you couldn’t have missed the phenomenon that is clickbait. I was so annoyed by this that I wrote a small Chrome extension named “Clickbait Killer” that filtered out such garbage from my feed so I wouldn’t have to deal with it. I’ve released it so that you too can use it, should you so choose. You can find out more information on the Clickbait Killer page.

But what is clickbait, and why is it annoying? Clickbait is the use of hyperbolic and sensationalist content that lures (the “bait” part) users into clicking (the “click” part) to see more. The owners of the visited sites just want to rake in revenue from advertising placed on their pages. Actual content plays second fiddle to phrases that have been algorithmically shown to generate the most clicks.

One popular form of clickbait is the “X Reasons Why” list. As an example, I typed “14 Reasons Why” into a Google search and got the following article as a result: “14 Scientific Reasons Why Bacon Is Really F*cking Good For You”. Notice the use of words like “Scientific”, “Really”, and “F*cking”. These are all attempts to get you to click on the article, regardless of what it contains. The first few paragraphs, if not the whole article, are generally very low on content. The authors know how search engines work and place certain keywords in the hope that their article rises in the Google search rankings. If someone searches for “is bacon good for me?”, for example, this anything-but-scientific article may pop up since it has all of the right words. That increases the chances that the user clicks on this article, which in turn generates more money for the business.

“Is there anything more satisfying, alluring or mouth-watering than bacon? A sizzling pan of bacon brightens the cloudiest of mornings; it’s the golden ticket to a perfect day. Everything good starts (and, realistically, ends) with bacon.” – What is the actual content here?
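
The “X Reasons Why” headline form is regular enough that a simple pattern can flag it. The following is only a minimal sketch in Python, not how Clickbait Killer itself works; the pattern, the three-word allowance, and the function name looks_like_listicle are all assumptions made for illustration.

```python
import re

# Hypothetical pattern (an assumption, not taken from the extension):
# a leading number, up to three filler words, then "Reasons Why".
LISTICLE = re.compile(r"^\s*\d+\s+(?:\w+\s+){0,3}reasons\s+why\b", re.IGNORECASE)


def looks_like_listicle(title: str) -> bool:
    """Return True if the title starts like an 'X Reasons Why' list."""
    return LISTICLE.search(title) is not None


if __name__ == "__main__":
    for title in (
        "14 Scientific Reasons Why Bacon Is Really F*cking Good For You",
        "What People Eat Around the World",
    ):
        print(title, "->", looks_like_listicle(title))
```

A pattern this narrow only catches one headline shape; a real filter would need many more rules and would still miss plenty.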

Another morally worrying aspect of clickbait is that the linked-to articles are often content that is simply relinked from elsewhere, which may have been itself relinked. Check out this example I saw today on my Facebook feed (also, notice the hyperbolic domain “thisblewmymind.com” – a sign that you are in for some clickbait): http://www.thisblewmymind.com/passengers-on-plane-whip-out-their-phones-the-minute-these-elderly-men-do-this/ This site hired someone to find an already viral video, add some intro text that helps them increase their numbers, and repackage the content as their own.

At least in this case the original content owner got views on their YouTube page. In many other cases content is taken with no attribution back to the original author.
An excellent New Yorker article, which I highly recommend (if you can stomach it), tells the sordid tale of a chain of content stealing:

At the bottom of a Dose post, there is usually a small “hat tip” (abbreviated as “H/T”). Many people don’t notice this citation, if they even reach the bottom of the post. On Dose’s first day of existence, its most successful list was called “23 Photos of People from All Over the World Next to How Much Food They Eat Per Day.” It was a clever illustration of global diversity and inequity: an American truck driver holding a tray of cheeseburgers and Starbucks Frappuccinos; a Maasai woman posing with eight hundred calories’ worth of milk and porridge. Beneath the final photograph, a line of tiny gray text read “H/T Elite Daily.” It linked to a post that Elite Daily, a Web site based in New York, had published a month earlier (“See the Incredible Differences in the Daily Food Intake of People Around the World”). That post, in turn, had linked to UrbanTimes (“80 People, 30 Countries and How Much They Eat on a Daily Basis”), which had credited Amusing Planet (“What People Eat Around the World”), which had cited a 2010 radio interview with Faith D’Aluisio and Peter Menzel, the writer and the photographer behind the project.

The article goes on to mention that the actual content creators invested 1 million dollars and 4 years of their lives creating this portfolio of images and are now trying to sell books and license their images in an attempt to recoup some of the money. Instead, the money for the content goes to the chain of clickbait sites that have taken the images illegally and immorally.

As I have mentioned before, it seems like calculated advertising is replacing content in more and more areas of life, and I find it very troublesome. Clickbait is a clear example of this and perhaps its highest incarnation. Instead of focusing on creating content that people enjoy and find meaningful, these companies use math and psychology to maximize revenue, often at the expense of actual content creators and disappointed users.

As always, I am interested to hear your ideas on this topic. As a reminder, you can download my app on the Clickbait Killer Project Page or download the source code on my GitHub page.

Calculated Experience

Years ago, I was hanging out with a group of friends, one of whom brought up a joke he had recently seen online. Although the content of the joke would probably make this, the second sentence of this post, much more interesting, I have to say I can’t remember it. And the joke isn’t really the point. The point was that almost everyone in the group had already seen the exact same joke online. The joke was posted on Reddit, a popular news aggregation site where people can upvote and downvote items as they see fit. I remember thinking to myself (and I think saying out loud) that it was incredible that among the millions of jokes that are posted every single day online, this group of people had all seen the exact same one. This post you are reading has been in the back of my head ever since.

In more and more areas of life, at least it seems to me, experiences are being quantified according to a formula and then spit back out to users, sorted accordingly. I say “experiences of life” because I can’t think of a better phrase that accounts for the broadness of such disparate items as “knowledge”, “current events”, “music”, and “film”, just to name a few. Instead of a user having to make a conscious decision as to what they want to experience online, the answer is just given to them. Maybe some examples will clear up what I mean.

  • You go to Reddit.com and naturally start at the top of the page – this is where the most highly rated items of the day are. Since they were the most highly rated by other users, chances are you will find the item good as well. You read the description, decide it is good enough, and click the link. Now you see the most highly rated comment of the thread and view comments in this order until you get tired of the thread.
  • You are having a party and use Spotify, a music application. You type in “party music” and see already created playlists. You click the first or second playlist and your party is ready to go.
  • You are on Netflix, a movie streaming application, and you want to watch a new movie. Movies are presented to you based on a complicated rating system, including feedback based on what you have watched and liked in the past. You pick one that is a relatively good match.
  • You log into Facebook and have the default “Top Stories” mode selected, in which you see stories presented in an order based on Facebook’s algorithm for “top”. From Facebook: “… it uses factors such as how many friends are commenting on a post to aggregate content that you’ll find interesting. It displays stories based on their relevance, rather than in chronological order.”

This list could obviously go on and on. It goes without saying that these services provide benefits to society, but I think there are some troubling aspects that one could raise about such systems and how they could affect society as a whole:

  • More and more people access the exact same information from the same sources. At a micro level you are probably going to find information that you find interesting. At a macro level, the chances are slimmer that you will come across someone with a different viewpoint than you. You have been consuming the same information as others, so the exchange of information between two parties will be lower. Instead of everyone being able to contribute a unique, nuanced perspective on a complicated issue, you are more likely to hear just a couple of points, and likely ones you have already heard and ones you may have given yourself.
  • The information people know will be highly stratified. When you do encounter someone who has a different opinion than you, the chances of a meaningful discussion are lower. You subscribe to “RightWingNews.com” on Facebook and you and thousands of other subscribers comment on the posts, all confirming ideas you all thought yesterday. You encounter someone who subscribed to “LeftWingNews.com” who did the same. You both think the other side is just saying gibberish. How could this not be the case? The information you have allowed yourself to consume is highly stratified and never challenges you to think in a different way from the exact way you already think, which you are already sure is the right way (if you are stuck inside of a system, how could it be the wrong way?).
  • There is a certain loss of agency in giving up the choice to make a conscious decision. If an algorithm is deciding for you, you aren’t deciding. When I was younger, going to get a new CD was a big, fun decision and after the purchase I listened to each song on the CD over and over. I don’t do that anymore. If I hear any sort of self-generated mix it is usually the best hit from each of the best artists in a particular genre. I don’t hear the other songs from the artist that aren’t the “best” and I don’t hear from the “non-best” bands. Music touches me less directly. I think it is a shame. Imagine an ice cream flavor machine choosing your flavor for you at the store. It determined that most people that day liked chocolate and so everyone, including you, gets chocolate. It tastes pretty good. You eat it and you go home.

What can we do to combat problems like this? Improvise – do things you don’t normally do. Read a newspaper from a publisher that you have never read before. Read a site that has the exact opposite view on an issue you have an opinion about. Go into a bookstore and buy a book you didn’t read an Amazon review for on a topic that you think is interesting but have never explored. Ask someone who you don’t usually talk to about music what they have been listening to lately. Hit the “Random article” button in Wikipedia and follow the links down the rabbit hole. Tell me other ideas you have!

Computer Crash!

Unfortunately, the server hosting this website crashed last week and the last backup that I ran only had information through December. In other words, the posts that I made about Germany have been lost.

I’ve put up a new server hosted by Amazon Web Services – I am always impressed by how easy it is to set up one of these things. I put up an Ubuntu 12.04 instance, copied over my WordPress backup, reinstalled WordPress and then put up a new instance of Redmine, which I use for project management. All of this took just a few hours.

Anyway, the new server should take the problem of crashing out of my hands – Amazon backs everything up so I shouldn’t have another problem like this again.

Linux VirtualBox Configuration

This one is also just for me (aren’t they all?).

I often make Linux virtual machines that run in VirtualBox and tinker with the settings to get the environment working well. I just ran some benchmarks with different settings and wanted to capture the optimal configuration so I don’t have to repeat these tests each time. The settings are:

On my Dell 15z, measured with the Hardinfo benchmarker:

– Linux Mint 15 (outperformed Ubuntu 13.10 on everything but N-Queens, which Ubuntu won by a large margin)
– Memory: 4096 MB (although I think this can be changed)
– Processors: 4 (3 works well too, and doesn’t run so hot)
– PAE/NX: enabled
– VT-x/AMD-V: enabled
– Nested Paging: DISABLED
– Display: max video memory
– Display: 3D acceleration ticked
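
The settings above are the ones from the VirtualBox settings dialog, but they could also be applied from the command line with VBoxManage modifyvm. The following Python sketch only builds that command, it does not run it. The VM name “mint15” and the 128 MB video memory value are assumptions (the note above only says “max”), and the option names are the ones I know from older VirtualBox releases; newer releases may spell some of them differently, so check the VBoxManage help for your version.

```python
import shlex
from typing import List


def modifyvm_args(vm_name: str, memory_mb: int = 4096, cpus: int = 4,
                  vram_mb: int = 128) -> List[str]:
    """Build a VBoxManage command line for the settings listed above.

    vram_mb=128 is an assumed stand-in for "max video memory".
    """
    return [
        "VBoxManage", "modifyvm", vm_name,
        "--memory", str(memory_mb),   # Memory 4096
        "--cpus", str(cpus),          # Processors 4 (3 also works)
        "--pae", "on",                # PAE/NX enabled
        "--hwvirtex", "on",           # VT-x/AMD-V enabled
        "--nestedpaging", "off",      # Nested Paging disabled
        "--vram", str(vram_mb),       # Display: video memory
        "--accelerate3d", "on",       # Display: 3D acceleration
    ]


if __name__ == "__main__":
    # "mint15" is a hypothetical VM name; the VM must be powered off.
    print(shlex.join(modifyvm_args("mint15")))
```

Printing the command rather than executing it makes it easy to review before pasting it into a terminal.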

Reformatting the 15z

This is just a note on reformatting my 15z, since I ran into some problems and will definitely forget how to fix them by the time I reformat again.

Problem: Wireless was really laggy when I reformatted.

Solution: First, get the older version of the 6230 wireless driver (v14.3.0.6) from the Dell site. After it installs, do the following: “There’s a parameter in the registry that will prevent the adapter from scanning while associated to an AP. It’s named ScanWhenAssociated. Default value is 1. Set it to 0 to prevent scanning. The negative outcome for this action is that roaming may be affected. When it comes time to roam, the WiFi adapter might not have an up to date scan list. To find the parameter in the registry you can do a search under CurrentControlSet.” as seen here. This fixes a weird intermittent lag issue.

Problem: Can’t get Webcam Software

Solution: I had to go to Dell’s technical chat to get the software manually. Apparently I can’t get to the download center since my computer is refurbished! The in-chat technician gave me links to the webcam software.

Problem: Audio Snaps, Crackles, and Pops

Solution: First, get the audio driver from the Dell website. Next, read here. Basically, get the Qualcomm driver for the Ethernet card.

That should be it – make sure that you have all of the other drivers as well!