그래프 알고리즘으로 친구 추천하기 Friend Recommendations with Graph Algorithms

4 분 소요

Friend Recommendations with Graph Algorithms

Friend recommendations keep users engaged by suggesting relevant connections. Graph databases excel at this—the data structure naturally encodes the signals needed for recommendations: mutual friends, shared interests, and network proximity.

1. Recommendation Signals

Effective friend recommendations combine multiple signals:

Mutual friends: More mutual connections = higher likelihood of knowing each other
Friends of friends: 2nd-degree connections are natural candidates
Shared interests: Common tags, groups, or activity patterns
Network proximity: Closer in the graph = more relevant

Each signal can be weighted based on its predictive value.

2. Mutual Friends Algorithm

The simplest and most effective signal—people with mutual friends often know each other:

@Query("""
    MATCH (me:User {username: $username})-[:FRIENDS_WITH]-(friend:User)
          -[:FRIENDS_WITH]-(candidate:User)
    WHERE candidate <> me
      AND NOT (me)-[:FRIENDS_WITH]-(candidate)
      AND NOT (me)-[:BLOCKED]-(candidate)
    WITH candidate, count(DISTINCT friend) as mutualCount
    WHERE mutualCount >= $minMutual
    RETURN candidate, mutualCount
    ORDER BY mutualCount DESC
    LIMIT $limit
    """)
List<FriendRecommendation> recommendByMutualFriends(String username, int minMutual, int limit);

The count(DISTINCT friend) counts how many paths exist between me and candidate through friends—this equals the mutual friend count.

3. Weighted Multi-Signal Recommendations

Combine signals with weights for better recommendations:

@Query("""
    MATCH (me:User {username: $username})

    // Find candidates (friends of friends)
    OPTIONAL MATCH (me)-[:FRIENDS_WITH]-(friend)-[:FRIENDS_WITH]-(candidate)
    WHERE candidate <> me AND NOT (me)-[:FRIENDS_WITH]-(candidate)

    WITH me, candidate, count(DISTINCT friend) as mutualFriends
    WHERE candidate IS NOT NULL

    // Shared interests signal
    OPTIONAL MATCH (me)-[:INTERESTED_IN]->(interest)<-[:INTERESTED_IN]-(candidate)
    WITH me, candidate, mutualFriends, count(DISTINCT interest) as sharedInterests

    // Same location signal
    OPTIONAL MATCH (me)-[:LIVES_IN]->(loc)<-[:LIVES_IN]-(candidate)
    WITH candidate, mutualFriends, sharedInterests,
         CASE WHEN loc IS NOT NULL THEN 1 ELSE 0 END as sameLocation

    // Calculate weighted score
    WITH candidate,
         mutualFriends,
         sharedInterests,
         sameLocation,
         (mutualFriends * 3) + (sharedInterests * 2) + (sameLocation * 1) as score

    WHERE score > 0
    RETURN candidate.username as username,
           candidate.displayName as displayName,
           mutualFriends,
           sharedInterests,
           sameLocation,
           score
    ORDER BY score DESC
    LIMIT $limit
    """)
List<ScoredRecommendation> getRecommendationsWithScore(String username, int limit);

The weights (3, 2, 1) prioritize mutual friends over shared interests over location. Adjust based on what works for your user base.

4. Recommendation Service

@Service
public class RecommendationService {

    private final UserRepository userRepository;

    public RecommendationService(UserRepository userRepository) {
        this.userRepository = userRepository;
    }

    public List<ScoredRecommendation> getFriendRecommendations(String username, int limit) {
        // Get weighted recommendations
        List<ScoredRecommendation> recommendations =
            userRepository.getRecommendationsWithScore(username, limit * 2);

        // Filter out recently rejected suggestions (optional)
        Set<String> recentlyRejected = getRecentlyRejected(username);
        recommendations = recommendations.stream()
            .filter(r -> !recentlyRejected.contains(r.username()))
            .limit(limit)
            .toList();

        return recommendations;
    }

    public List<User> getPeopleYouMayKnow(String username, int limit) {
        // Simpler version for quick suggestions
        return userRepository.recommendByMutualFriends(username, 2, limit);
    }
}

5. Jaccard Similarity

For more sophisticated matching, calculate Jaccard similarity—the intersection over union of friend sets:

@Query("""
    MATCH (me:User {username: $username})-[:FRIENDS_WITH]-(myFriend:User)
    WITH me, collect(myFriend) as myFriends

    MATCH (candidate:User)-[:FRIENDS_WITH]-(theirFriend:User)
    WHERE candidate <> me
      AND NOT (me)-[:FRIENDS_WITH]-(candidate)
      AND NOT (me)-[:BLOCKED]-(candidate)
    WITH me, myFriends, candidate, collect(theirFriend) as theirFriends

    // Calculate intersection
    WITH candidate, myFriends, theirFriends,
         [f IN myFriends WHERE f IN theirFriends] as intersection

    // Calculate union (myFriends + theirFriends not in myFriends)
    WITH candidate,
         size(intersection) as intersectionSize,
         size(myFriends) + size([f IN theirFriends WHERE NOT f IN myFriends]) as unionSize

    WHERE unionSize > 0
    WITH candidate,
         intersectionSize,
         toFloat(intersectionSize) / unionSize as jaccardSimilarity

    WHERE jaccardSimilarity > $threshold
    RETURN candidate, jaccardSimilarity, intersectionSize as mutualFriends
    ORDER BY jaccardSimilarity DESC
    LIMIT $limit
    """)
List<SimilarityResult> findSimilarUsers(String username, double threshold, int limit);

Jaccard similarity ranges from 0 to 1. A threshold of 0.1-0.2 usually filters noise while keeping relevant suggestions.

6. Cold Start Problem

New users have no friends, so friend-based recommendations fail. Handle this with fallback strategies:

public List<User> getRecommendationsForNewUser(String username, int limit) {
    User user = userRepository.findByUsername(username)
        .orElseThrow(() -> new UserNotFoundException(username));

    // Strategy 1: Same location
    if (user.getLocation() != null) {
        List<User> localUsers = userRepository.findPopularUsersInLocation(
            user.getLocation().getId(), limit);
        if (!localUsers.isEmpty()) {
            return localUsers;
        }
    }

    // Strategy 2: Shared interests
    if (!user.getInterests().isEmpty()) {
        List<User> interestMatch = userRepository.findUsersBySharedInterests(
            username, limit);
        if (!interestMatch.isEmpty()) {
            return interestMatch;
        }
    }

    // Strategy 3: Popular/active users
    return userRepository.findMostActiveUsers(limit);
}

@Query("""
    MATCH (u:User)-[:LIVES_IN]->(:Location {id: $locationId})
    WHERE u.status = 'ACTIVE'
    OPTIONAL MATCH (u)<-[:FOLLOWS]-(follower)
    WITH u, count(follower) as followerCount
    ORDER BY followerCount DESC
    LIMIT $limit
    RETURN u
    """)
List<User> findPopularUsersInLocation(Long locationId, int limit);

@Query("""
    MATCH (me:User {username: $username})-[:INTERESTED_IN]->(interest)
          <-[:INTERESTED_IN]-(other:User)
    WHERE other <> me AND NOT (me)-[:FRIENDS_WITH]-(other)
    WITH other, count(interest) as sharedCount
    ORDER BY sharedCount DESC
    LIMIT $limit
    RETURN other
    """)
List<User> findUsersBySharedInterests(String username, int limit);

7. Neo4j Graph Data Science (Optional)

For large-scale recommendations, Neo4j’s Graph Data Science library provides optimized algorithms:

// First, project the graph
@Query("""
    CALL gds.graph.project(
        'social-network',
        'User',
        {
            FRIENDS_WITH: {orientation: 'UNDIRECTED'},
            FOLLOWS: {orientation: 'NATURAL'}
        }
    )
    """)
void projectGraph();

// Node similarity using GDS
@Query("""
    CALL gds.nodeSimilarity.stream('social-network', {
        topK: $topK,
        similarityCutoff: $cutoff
    })
    YIELD node1, node2, similarity
    WHERE gds.util.asNode(node1).username = $username
    RETURN gds.util.asNode(node2) as similarUser, similarity
    ORDER BY similarity DESC
    """)
List<SimilarityResult> findSimilarUsersGDS(String username, int topK, double cutoff);

GDS computes similarity more efficiently for large graphs by using optimized algorithms and parallel processing.

8. REST Controller

@RestController
@RequestMapping("/api/recommendations")
public class RecommendationController {

    private final RecommendationService recommendationService;

    @GetMapping("/friends")
    public ResponseEntity<List<ScoredRecommendation>> getFriendRecommendations(
            @AuthenticationPrincipal UserDetails user,
            @RequestParam(defaultValue = "10") int limit) {
        return ResponseEntity.ok(
            recommendationService.getFriendRecommendations(user.getUsername(), limit));
    }

    @GetMapping("/people-you-may-know")
    public ResponseEntity<List<UserDto>> getPeopleYouMayKnow(
            @AuthenticationPrincipal UserDetails user,
            @RequestParam(defaultValue = "5") int limit) {
        List<User> suggestions = recommendationService.getPeopleYouMayKnow(
            user.getUsername(), limit);
        return ResponseEntity.ok(suggestions.stream().map(UserDto::from).toList());
    }

    @PostMapping("/dismiss/{username}")
    public ResponseEntity<Void> dismissRecommendation(
            @AuthenticationPrincipal UserDetails user,
            @PathVariable String username) {
        recommendationService.dismissRecommendation(user.getUsername(), username);
        return ResponseEntity.ok().build();
    }
}

9. Caching Recommendations

Recommendations are expensive to compute. Cache them:

@Service
public class CachedRecommendationService {

    private final RecommendationService recommendationService;
    private final Cache<String, List<ScoredRecommendation>> cache;

    public CachedRecommendationService(RecommendationService recommendationService) {
        this.recommendationService = recommendationService;
        this.cache = Caffeine.newBuilder()
            .expireAfterWrite(Duration.ofHours(1))
            .maximumSize(10_000)
            .build();
    }

    public List<ScoredRecommendation> getRecommendations(String username, int limit) {
        return cache.get(username, key ->
            recommendationService.getFriendRecommendations(key, limit));
    }

    public void invalidate(String username) {
        cache.invalidate(username);
    }
}

Invalidate the cache when a user adds friends, updates interests, or changes location.

10. Conclusion

Graph-based friend recommendations leverage the network structure that already encodes social signals. Mutual friends remain the strongest predictor, but combining multiple signals improves relevance. For new users, fall back to location, interests, or popularity. At scale, Neo4j’s Graph Data Science library provides optimized similarity algorithms. This concludes the social network series—from basic setup through schema design, relationships, feeds, and recommendations.

Twitter Facebook LinkedIn

울이

그래프 알고리즘으로 친구 추천하기 Friend Recommendations with Graph Algorithms

Friend Recommendations with Graph Algorithms

1. Recommendation Signals

2. Mutual Friends Algorithm

3. Weighted Multi-Signal Recommendations

4. Recommendation Service

5. Jaccard Similarity

6. Cold Start Problem

7. Neo4j Graph Data Science (Optional)

8. REST Controller

9. Caching Recommendations

10. Conclusion

공유하기

댓글남기기

참고

백엔드 개발자를 위한 엣지 컴퓨팅 패턴 Edge Computing Patterns for Backend Developers

백엔드 개발자를 위한 엣지 컴퓨팅 패턴 Edge Computing Patterns for Backend Developers

클라우드 네이티브 Observability Part 5 - Observability 데이터로 프로덕션 이슈 디버깅 Cloud-Native Observability Stack Part 5 - Debugging Production Issues with Observability Data

클라우드 네이티브 Observability Part 5 - Observability 데이터로 프로덕션 이슈 디버깅 Cloud-Native Observability Stack Part 5 - Debugging Production Issues with Observability Data

클라우드 네이티브 Observability Part 4 - Prometheus/Grafana로 메트릭과 알림 Cloud-Native Observability Stack Part 4 - Metrics and Alerting with Prometheus/Grafana

클라우드 네이티브 Observability Part 4 - Prometheus/Grafana로 메트릭과 알림 Cloud-Native Observability Stack Part 4 - Metrics and Alerting with Prometheus/Grafana

클라우드 네이티브 Observability Part 3 - 구조화된 로깅과 Correlation ID Cloud-Native Observability Stack Part 3 - Structured Logging with Correlation IDs

Cloud-Native Observability Stack Part 3 - Structured Logging with Correlation IDs