Thoughts on…

Java Middleware & Systems Management

Hibernate Many-To-Many Revisited

with 11 comments

The modeling problem is classic: you have two entities, say Users and Roles, which have a many-to-many relationship with one another. In other words, each user can be in multiple roles, and each role can have multiple users associated with it.

The schema is pretty standard and would look like:

CREATE TABLE app_user ( 
   id INTEGER,
   PRIMARY KEY ( id ) );

CREATE TABLE app_role (
   id INTEGER,
   PRIMARY KEY ( id ) );

CREATE TABLE app_user_role ( 
   user_id INTEGER,
   role_id INTEGER,
   PRIMARY KEY ( user_id, role_id ),
   FOREIGN KEY ( user_id ) REFERENCES app_user ( id ),
   FOREIGN KEY ( role_id ) REFERENCES app_role ( id ) );

But there are really two choices for how you want to expose this at the Hibernate / EJB3 layer. The first strategy employs the use of the @ManyToMany annotation:

@Entity 
@Table(name = "APP_USER")
public class User {
    @Id
    private Integer id;
    
    @ManyToMany
    @JoinTable(name = "APP_USER_ROLE", 
       joinColumns = { @JoinColumn(name = "USER_ID") }, 
       inverseJoinColumns = { @JoinColumn(name = "ROLE_ID") })
    private Set<Role> roles = new HashSet<Role>();
}

@Entity 
@Table(name = "APP_ROLE")
public class Role {
    @Id
    private Integer id;
    
    @ManyToMany(mappedBy = "roles")
    private Set<User> users = new HashSet<User>();
}

The second strategy uses a set of @ManyToOne mappings and requires the creation of a third “mapping” entity:

public class UserRolePK {
    @ManyToOne
    @JoinColumn(name = "USER_ID", referencedColumnName = "ID")
    private User user;

    @ManyToOne
    @JoinColumn(name = "ROLE_ID", referencedColumnName = "ID")
    private Role role;
}

@Entity @IdClass(UserRolePK.class) 
@Table(name = "APP_USER_ROLE")
public class UserRole {
    @Id
    private User user;

    @Id
    private Role role;
}

@Entity 
@Table(name = "APP_USER")
public class User {
    @Id
    private Integer id;
    
    @OneToMany(mappedBy = "user")
    private Set<UserRole> userRoles;
}

@Entity 
@Table(name = "APP_ROLE")
public class Role {
    @Id
    private Integer id;
    
    @OneToMany(mappedBy = "role")
    private Set<UserRole> userRoles;
}

The most obvious pro for the @ManyToMany solution is simpler data retrieval queries. The annotation automagically generates the proper SQL under the covers, and allows access to data from the other side of the linking table with a simple join at the HQL/JPQL level. For example, to get the roles for some user:

SELECT r 
FROM User u 
JOIN u.roles r 
WHERE u.id = :someUserId

You can still retrieve the same data with the other solution, but it’s not as elegant. It requires traversing from a user to the userRoles relationship, and then accessing the roles associated with those mapping entities:

SELECT ur.role 
FROM User u 
JOIN u.userRoles ur 
WHERE u.id = :someUserId

The inelegance of the second strategy becomes clear if you had several many-to-many relationships that you needed to traverse in a single query. If you had to use explicit mapping entities for each join table, the query would look like:

SELECT threeFour.four
FROM One one 
JOIN one.oneTwos oneTwo 
JOIN oneTwo.two.twoThrees twoThree 
JOIN twoThree.three.threeFours threeFour
where one.id = :someId

Whereas using @ManyToMany annotations, exclusively, would result in a query with the following form:

SELECT four 
FROM One one 
JOIN one.twos two 
JOIN two.threes three 
JOIN threes.four 
WHERE one.id = :someId

Some readers might wonder why, if we have explicit mapping table entities, we don’t just use them directly to make the query a little more intelligible / human-readable:

SELECT threeFour.four
FROM OneTwo oneTwo, TwoThree twoThree, ThreeFour threeFour
WHERE oneTwo.two = twoThree.two
AND twoThree.three = threeFour.three
AND oneTwo.one.id = :someId

Although I agree this query may be slightly easier to understand at a glance (especially if you’re used to writing native SQL), it definitely doesn’t save on keystrokes. Aside from that, it starts to pull away from thinking about your data model purely in terms of its high-level object relations.

In a read-mostly system, where access to data is the most frequent operation, it just makes sense to use the @ManyToMany mapping strategy. It achieves the goal while keeping the queries as simple and straight forward as possible.

However, elegance of select-statements should not be the only point considered when choosing a strategy. The more elaborate solution using the explicit mapping entiies does have its merits. Consider the problem of having to delete users that have properties matching a specific condition, which due to the foreign keys also require deleting user-role relationships matching that same criteria:

DELETE UserRole ur 
WHERE ur.user.id IN ( 
   SELECT u 
   FROM User u 
   WHERE u.someProperty = :someInterestingValue );
DELETE User u WHERE u.someProperty = :someInterestingValue;

If the mapping entity did not exist, the role objects would have to be loaded into the session, traversed one at a time, and have all of their users removed…after which, the role objects themselves could be deleted from the system. If your application only had a handful of users that matched this condition, either solution would probably perform just fine.

But what if you had tens of millions of users in your system, and this query happened to match 10% of them? (OK, perhaps this particular scenario is a bit contrived, but there *are* plenty of applications out there where the number of many-to-many relationships order in the tens of millions or more.) The logic would have to load more than a million users across the wire from the database which, as a result, might require you to implement a manual batching mechanism. You would load, say, 1000 users into memory at once, operate on them, flush/clear the session, then load the next batch, and so on. Memory requirements aside, you might find the transaction takes too long or might even time-out. In this case, you would need to execute each of the batches inside its own transaction, driving the process from outside of a transactional context.

Unfortunately, the data-load isn’t the only issue. The actual deletion work has problems too. You’re going to have to, for each user in turn, remove all of its roles (e.g., “user.getRoles().clear()”) and then delete the user itself (e.g., “entityManager.remove(user)”). These operations translate into two native SQL delete statements for each matched user – one to remove the related entries from the app_user_role table, and the other to remove the user itself from the app_user table).

All of these performance issues stem from the fact that a large amount of data has to be loaded across the wire and then manipulated, which results in a number of roundtrips proportional to the number of rows that match the criteria. However, by creating the mapping entity, it becomes possible to execute everything in two statements, neither of which even load data across the wire.

So what’s the right solution? Well, the interesting thing about this problem space is that the two solutions described above are not mutually exclusive. There’s nothing that prevents you from using both of them simultaneously:

public class UserRolePK {
    @ManyToOne
    @JoinColumn(name = "USER_ID", referencedColumnName = "ID")
    private User user;

    @ManyToOne
    @JoinColumn(name = "ROLE_ID", referencedColumnName = "ID")
    private Role role;
}

@Entity @IdClass(UserRolePK.class) 
@Table(name = "APP_USER_ROLE")
public class UserRole {
    @Id
    private User user;

    @Id
    private Role role;
}

@Entity 
@Table(name = "APP_USER")
public class User {
    @Id
    private Integer id;
    
    @OneToMany(mappedBy = "user")
    private Set<UserRole> userRoles;
    
    @ManyToMany
    @JoinTable(name = "APP_USER_ROLE", 
       joinColumns = { @JoinColumn(name = "USER_ID") }, 
       inverseJoinColumns = { @JoinColumn(name = "ROLE_ID") })
    private Set<Role> roles = new HashSet<Role>();
}

@Entity 
@Table(name = "APP_ROLE")
public class Role {
    @Id
    private Integer id;
    
    @OneToMany(mappedBy = "role")
    private Set<UserRole> userRoles;
    
    @ManyToMany(mappedBy = "roles")
    private Set<User> users = new HashSet<User>();
}

This hybrid solution actually gives you the best of both worlds: elegant queries and efficient updates to the linking table. Granted, the boilerplate to set up all the mappings might seem tedious, but that extra effort is well worth the pay-off.

About these ads

Written by josephmarques

February 22, 2010 at 12:52 pm

Posted in hibernate

Tagged with

11 Responses

Subscribe to comments with RSS.

  1. This is a great example!
    Just to clean up the readability could you assign the generic types to the sets?

    Srgjan Srepfler

    February 24, 2010 at 10:24 pm

    • Done. Thanks for the feedback.

      josephmarques

      February 28, 2010 at 9:26 pm

  2. Great article ! I hoped it would go as far as explaining how to use the Criteria API to have an INNER JOIN statement.

    Indeed, I’m having trouble with such a design. I cannot find a way to fold and cap it with a Criteria.

    I posted on the Hibernate forum at: https://forum.hibernate.org/viewtopic.php?p=2439493#p2439493

    If you feel like giving it a look.

    Thanks for your clear writing by the way !

    Stephane

    December 16, 2010 at 11:28 am

  3. [...] [Thoughts On: Hibernate Many-To-Many Revisited] – They guy has some other thoughts about how to achieve many-to-many mappings and handle for PK. He somewhat agrees with me, but offers a hybrid approach. Share this:TwitterEmailDiggLinkedInRedditFacebookPrintLike this:LikeBe the first to like this post. Posted in Java [...]

  4. thanh for your post. But, how can I add more conditions.

    EX: I have 2 table student and cerfiticate which is many-to-many relationship. How can I list all studen have name like ‘%an%’ and have cerfiticate A, B, C

    dragonfly6373

    November 28, 2012 at 10:32 pm

    • Something like this maybe?

      SELECT s FROM Student s JOIN s.studentCertificates sCerts WHERE LOWER(s.name) LIKE ‘%an%’ AND sCerts.id IN (A, B, C)

      On Wed, Nov 28, 2012 at 9:32 PM, Thoughts on…

      josephmarques

      November 30, 2012 at 11:40 am

  5. Good web site you have here.. It’s difficult to find good quality writing like yours these days. I honestly appreciate individuals like you! Take care!!

  6. It’s awesome to visit this web page and reading the views of all colleagues on the topic of this paragraph, while I am also eager of getting knowledge.

    k-brandshop.com

    May 15, 2013 at 11:07 am

  7. it would be nice if you had given testcases of all operations such as create update delete

    praveen

    December 13, 2013 at 10:10 am

  8. Disclaimer: I’m new to hibernate so bear with me :D

    In the hybrid solution where you have:

    @ManyToMany
    @JoinTable(name = “APP_USER_ROLE”,
    joinColumns = { @JoinColumn(name = “USER_ID”) },
    inverseJoinColumns = { @JoinColumn(name = “ROLE_ID”) })
    private Set roles = new HashSet();

    Does this mean that every role stores a Set of full User objects for all users assigned to that role? I.e. If I load this role somewhere, is it loading every user assigned to that role into memory?

    If so, will this not cause performance problems in large applications? I’m used to trying to store lists/sets of other object’s IDs only within my objects rather than trying to store the full other objects themselves due to fear of performance problems later. E.g. here it would be a set of role IDs only referencing all roles for this user. Then If I need details from the role I have the ID so can look it up then. Perhaps I’m worrying too much here or on the wrong page entirely, but its this mindset that is confusing me a little now while trying to learn hibernate (via spring data jpa).

    I think the bit I’m confused about is: “However, by creating the mapping entity, it becomes possible to execute everything in two statements, neither of which even load data across the wire.” – could you elaborate on this?

    Thanks,
    H

    H

    July 14, 2014 at 10:03 am

  9. You share interesting things here. I think that your page can go viral
    easily, but you must give it initial boost and i know how to do it, just type in google (with quotes) for – “mundillo traffic increase make your website go viral”

    Delores

    October 9, 2014 at 5:39 am


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

%d bloggers like this: