Skip to content
Advertisement

PHP fetch all Twitter Followers and compare them to friends

I am looking for scalable way to do the following:

  • User login
  • Fetch all Friends from Twitter
  • Fetch all Followers from Twitter
  • Display all Friends which aren’t Followers

The Problem: How can this be done in a scalable way? An user can have up to 2 million friends or followers. Currently I’m storing both inside an SQLite table and compare them through a loop. When the user comes back the table is cleared and process starts again.

This works fine on 100 – 1000 Friends, but will be tricky with 500000 Friends. I can’t cache the lists because they can change every moment.

Does anyone know a good way to handle such big amount of data?

Advertisement

Answer

I don’t know what your database looks like, but this is how I would set it up.

CREATE TABLE twitter_users (
    user_id INTEGER PRIMARY KEY NOT NULL,
    screen_name VARCHAR(20) NOT NULL
);

CREATE TABLE friends (
    friend_id INTEGER PRIMARY KEY NOT NULL
);

CREATE TABLE followers (
    follower_id INTEGER PRIMARY KEY NOT NULL
);

Then you can use this SQL to get the friends who are not followers.

SELECT friend_id, screen_name
FROM friends
LEFT JOIN followers ON follower_id = friend_id
LEFT JOIN twitter_users ON user_id = friend_id
WHERE follower_id IS NULL

If the screen name is NULL it means they are not in your twitter_users table. You can look up the missing users and store them for later. Screen names can change so you might need to update the table periodically.

Use the friends/ids and followers/ids APIs to get a list of friend and follower ids 5,000 at a time. Use the users/lookup API to get up to 100 screen names. If a user has 2,000,000 friends it will take 400 api calls to get the list of ids so you should still cache the list at least for popular users.

User contributions licensed under: CC BY-SA
8 People found this is helpful
Advertisement