Thoughts on…

Java Middleware & Systems Management

Hibernate Many-To-Many Revisited

The modeling problem is classic: you have two entities, say Users and Roles, which have a many-to-many relationship with one another. In other words, each user can be in multiple roles, and each role can have multiple users associated with it.

The schema is pretty standard and would look like:

CREATE TABLE app_user ( 
   id INTEGER,
   PRIMARY KEY ( id ) );

CREATE TABLE app_role (
   id INTEGER,
   PRIMARY KEY ( id ) );

CREATE TABLE app_user_role ( 
   user_id INTEGER,
   role_id INTEGER,
   PRIMARY KEY ( user_id, role_id ),
   FOREIGN KEY ( user_id ) REFERENCES app_user ( id ),
   FOREIGN KEY ( role_id ) REFERENCES app_role ( id ) );

But there are really two choices for how you want to expose this at the Hibernate / EJB3 layer. The first strategy employs the @ManyToMany annotation:

@Entity 
@Table(name = "APP_USER")
public class User {
    @Id
    private Integer id;
    
    @ManyToMany
    @JoinTable(name = "APP_USER_ROLE", 
       joinColumns = { @JoinColumn(name = "USER_ID") }, 
       inverseJoinColumns = { @JoinColumn(name = "ROLE_ID") })
    private Set<Role> roles = new HashSet<Role>();
}

@Entity 
@Table(name = "APP_ROLE")
public class Role {
    @Id
    private Integer id;
    
    @ManyToMany(mappedBy = "roles")
    private Set<User> users = new HashSet<User>();
}
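With this mapping, creating a user-role link is done through the owning side of the relationship (User.roles, since Role.users is marked mappedBy). A minimal sketch, assuming an active transaction and an EntityManager named em (hypothetical variable names):

```java
User user = em.find(User.class, someUserId);
Role role = em.find(Role.class, someRoleId);
user.getRoles().add(role);   // the owning side drives the INSERT into APP_USER_ROLE
role.getUsers().add(user);   // keeps the in-memory model consistent; not persisted by itself
```

Only changes to the owning side are written to the join table; updating the inverse side alone would silently do nothing.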

The second strategy uses a set of @ManyToOne mappings and requires the creation of a third “mapping” entity:

public class UserRolePK implements Serializable {
    // Field names match the @Id attributes of UserRole;
    // types match the identifiers of the referenced entities
    private Integer user;
    private Integer role;

    // equals() and hashCode() omitted for brevity, but required
}

@Entity @IdClass(UserRolePK.class) 
@Table(name = "APP_USER_ROLE")
public class UserRole {
    @Id
    @ManyToOne
    @JoinColumn(name = "USER_ID", referencedColumnName = "ID")
    private User user;

    @Id
    @ManyToOne
    @JoinColumn(name = "ROLE_ID", referencedColumnName = "ID")
    private Role role;
}

@Entity 
@Table(name = "APP_USER")
public class User {
    @Id
    private Integer id;
    
    @OneToMany(mappedBy = "user")
    private Set<UserRole> userRoles;
}

@Entity 
@Table(name = "APP_ROLE")
public class Role {
    @Id
    private Integer id;
    
    @OneToMany(mappedBy = "role")
    private Set<UserRole> userRoles;
}

The most obvious pro for the @ManyToMany solution is simpler data retrieval queries. The annotation automagically generates the proper SQL under the covers, and allows access to data from the other side of the linking table with a simple join at the HQL/JPQL level. For example, to get the roles for some user:

SELECT r 
FROM User u 
JOIN u.roles r 
WHERE u.id = :someUserId

You can still retrieve the same data with the other solution, but it’s not as elegant. It requires traversing from a user to the userRoles relationship, and then accessing the roles associated with those mapping entities:

SELECT ur.role 
FROM User u 
JOIN u.userRoles ur 
WHERE u.id = :someUserId
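Either form is executed the same way from application code. A minimal sketch, assuming JPA 2.0's TypedQuery and an EntityManager named em (both hypothetical here):

```java
List<Role> roles = em.createQuery(
        "SELECT r FROM User u JOIN u.roles r WHERE u.id = :someUserId",
        Role.class)
    .setParameter("someUserId", someUserId)
    .getResultList();
```
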

The inelegance of the second strategy becomes clear when several many-to-many relationships need to be traversed in a single query. Using explicit mapping entities for each join table, the query would look like:

SELECT threeFour.four
FROM One one 
JOIN one.oneTwos oneTwo 
JOIN oneTwo.two.twoThrees twoThree 
JOIN twoThree.three.threeFours threeFour
WHERE one.id = :someId

Whereas using @ManyToMany annotations exclusively would result in a query of the following form:

SELECT four 
FROM One one 
JOIN one.twos two 
JOIN two.threes three 
JOIN three.fours four 
WHERE one.id = :someId

Some readers might wonder why, if we have explicit mapping table entities, we don’t just use them directly to make the query a little more intelligible / human-readable:

SELECT threeFour.four
FROM OneTwo oneTwo, TwoThree twoThree, ThreeFour threeFour
WHERE oneTwo.two = twoThree.two
AND twoThree.three = threeFour.three
AND oneTwo.one.id = :someId

Although I agree this query may be slightly easier to understand at a glance (especially if you’re used to writing native SQL), it definitely doesn’t save on keystrokes. Aside from that, it starts to pull away from thinking about your data model purely in terms of its high-level object relations.

In a read-mostly system, where reading data is the most frequent operation, it just makes sense to use the @ManyToMany mapping strategy. It achieves the goal while keeping the queries as simple and straightforward as possible.

However, the elegance of select statements should not be the only point considered when choosing a strategy. The more elaborate solution using explicit mapping entities does have its merits. Consider the problem of having to delete users whose properties match a specific condition, which, due to the foreign keys, also requires deleting the user-role relationships matching that same criteria:

DELETE FROM UserRole ur 
WHERE ur.user.id IN ( 
   SELECT u.id 
   FROM User u 
   WHERE u.someProperty = :someInterestingValue );
DELETE FROM User u WHERE u.someProperty = :someInterestingValue;

If the mapping entity did not exist, the matching user objects would have to be loaded into the session, traversed one at a time, and have all of their roles removed… after which, the user objects themselves could be deleted from the system. If your application only had a handful of users that matched this condition, either solution would probably perform just fine.

But what if you had tens of millions of users in your system, and this query happened to match 10% of them? (OK, perhaps this particular scenario is a bit contrived, but there *are* plenty of applications out there where the number of many-to-many relationships runs to the tens of millions or more.) The logic would have to load millions of users across the wire from the database which, as a result, might require you to implement a manual batching mechanism. You would load, say, 1000 users into memory at once, operate on them, flush/clear the session, then load the next batch, and so on. Memory requirements aside, you might find the transaction takes too long or even times out. In that case, you would need to execute each batch inside its own transaction, driving the process from outside of a transactional context.

Unfortunately, the data load isn’t the only issue; the actual deletion work has problems too. For each user in turn, you have to remove all of its roles (e.g., “user.getRoles().clear()”) and then delete the user itself (e.g., “entityManager.remove(user)”). These operations translate into two native SQL delete statements for each matched user – one to remove the related entries from the app_user_role table, and the other to remove the user itself from the app_user table.
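The batched loop described above would look roughly like this (a sketch, not production code; variable names such as someInterestingValue are hypothetical, and transaction demarcation is omitted; assumes an EntityManager named em):

```java
List<User> batch;
do {
    // always fetch the "first page" -- matched users are deleted as we go,
    // so advancing an offset would skip rows
    batch = em.createQuery(
            "SELECT u FROM User u WHERE u.someProperty = :v", User.class)
        .setParameter("v", someInterestingValue)
        .setMaxResults(1000)
        .getResultList();
    for (User user : batch) {
        user.getRoles().clear();  // one DELETE against app_user_role per user
        em.remove(user);          // plus one DELETE against app_user
    }
    em.flush();
    em.clear();                   // detach the batch to keep memory bounded
} while (!batch.isEmpty());
```

Even written carefully, this still pays one round trip per page plus per-user delete statements, which is exactly the cost the mapping-entity approach avoids.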

All of these performance issues stem from the fact that a large amount of data has to be loaded across the wire and then manipulated, which results in a number of roundtrips proportional to the number of rows that match the criteria. However, by creating the mapping entity, it becomes possible to execute everything in two bulk statements, neither of which loads any data across the wire.
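Those two JPQL bulk deletes translate into SQL of roughly this shape (illustrative only – the exact statements Hibernate emits may differ, and some_property stands in for whatever column the entity property maps to):

```sql
DELETE FROM app_user_role
 WHERE user_id IN ( SELECT id FROM app_user
                    WHERE some_property = ? );

DELETE FROM app_user
 WHERE some_property = ?;
```

From JPA, each statement is issued with Query.executeUpdate(), so no entities are ever loaded into the session.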

So what’s the right solution? Well, the interesting thing about this problem space is that the two solutions described above are not mutually exclusive. There’s nothing that prevents you from using both of them simultaneously:

public class UserRolePK implements Serializable {
    // Field names match the @Id attributes of UserRole;
    // types match the identifiers of the referenced entities
    private Integer user;
    private Integer role;

    // equals() and hashCode() omitted for brevity, but required
}

@Entity @IdClass(UserRolePK.class) 
@Table(name = "APP_USER_ROLE")
public class UserRole {
    @Id
    @ManyToOne
    @JoinColumn(name = "USER_ID", referencedColumnName = "ID")
    private User user;

    @Id
    @ManyToOne
    @JoinColumn(name = "ROLE_ID", referencedColumnName = "ID")
    private Role role;
}

@Entity 
@Table(name = "APP_USER")
public class User {
    @Id
    private Integer id;
    
    @OneToMany(mappedBy = "user")
    private Set<UserRole> userRoles;
    
    @ManyToMany
    @JoinTable(name = "APP_USER_ROLE", 
       joinColumns = { @JoinColumn(name = "USER_ID") }, 
       inverseJoinColumns = { @JoinColumn(name = "ROLE_ID") })
    private Set<Role> roles = new HashSet<Role>();
}

@Entity 
@Table(name = "APP_ROLE")
public class Role {
    @Id
    private Integer id;
    
    @OneToMany(mappedBy = "role")
    private Set<UserRole> userRoles;
    
    @ManyToMany(mappedBy = "roles")
    private Set<User> users = new HashSet<User>();
}

This hybrid solution actually gives you the best of both worlds: elegant queries and efficient updates to the linking table. Granted, the boilerplate to set up all the mappings might seem tedious, but the extra effort is well worth it.

Written by josephmarques

February 22, 2010 at 12:52 pm

Posted in hibernate

Web Development Tips – automate the little things

I recall a colleague of mine mentioning several weeks ago that it’s annoying to have to log into RHQ every time you redeploy UI code that causes portal-war’s web context to reload. I completely agreed at the time, but it wasn’t until today that I finally got annoyed enough to look for a workaround myself. Here’s the solution I ended up with:

1) Use Firefox
2) Download and install GreaseMonkey
3) Install the AutoLogin script
4) Log into “http://localhost:7080/Login.do” and make sure to tell FF to remember your password
5) Test that auto-login is working properly by logging out of the application…you should be forwarded to the login page, which FF will automatically fill in with your saved credentials, and the grease monkey script will perform the login for you

This should also work when you get logged out due to session expiry. The expiry handler will redirect you back to /Login.do, which will now automatically log you back in and – on a best effort basis – redirect you back to the last “valid” page you were on. RHQ has a mechanism for recording the last couple of pages you visited (see WebUserTrackingFilter) and will try them in most-recently-visited order until it finds a page that doesn’t blow up with JSF’s “classic” ViewExpiredException. I discuss the details of how this mechanism works in my other post.

Note: if you ever want to log into localhost with a different user, all you have to do is click the GreaseMonkey icon (on the far right-hand side of the status bar at the bottom of your browser) and you’ll temporarily disable the AutoLogin script from executing.

How would you solve this? How have you solved this? I’m eager to read your post backs.

Written by josephmarques

February 5, 2010 at 4:19 pm

Posted in webdev

Java IO Performance

I thought I’d share a good article I just read on IO performance in a Java environment. It compares buffered versus non-buffered streams, both of which are synchronized for thread-safety, versus file channels, which are unsynchronized. Going one step further, it also benchmarks the effect that buffer sizes (or “chunked” reads) have in each of those scenarios, as well as the consequence of memory-mapping those buffers.

The comparison to some of the predominant ways to read files in C towards the end adds some nice perspective.

http://nadeausoftware.com/articles/2008/02/java_tip_how_read_files_quickly
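To make the buffered-stream versus file-channel comparison concrete, here's a self-contained toy sketch (my own illustration, not the article's benchmark harness; class and method names are mine, and timings on a tiny file are indicative only):

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ReadBenchSketch {

    // Read via a BufferedInputStream -- streams are internally synchronized.
    static byte[] readBuffered(Path p) throws IOException {
        try (InputStream in = new BufferedInputStream(Files.newInputStream(p))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192]; // the "chunk" size is one of the variables the article benchmarks
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        }
    }

    // Read via an unsynchronized FileChannel into a ByteBuffer.
    static byte[] readChannel(Path p) throws IOException {
        try (FileChannel ch = FileChannel.open(p, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate((int) ch.size());
            while (buf.hasRemaining() && ch.read(buf) != -1) {
                // keep reading until the buffer is full or EOF
            }
            return buf.array();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("io-bench", ".dat");
        Files.write(tmp, "hello, benchmark".getBytes());

        long t0 = System.nanoTime();
        byte[] a = readBuffered(tmp);
        long t1 = System.nanoTime();
        byte[] b = readChannel(tmp);
        long t2 = System.nanoTime();

        System.out.println("buffered ns=" + (t1 - t0) + ", channel ns=" + (t2 - t1));
        System.out.println("identical=" + java.util.Arrays.equals(a, b));
        Files.delete(tmp);
    }
}
```

A real benchmark needs large files, JIT warmup, and multiple runs, which is exactly what the linked article does properly.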

Written by josephmarques

September 29, 2009 at 9:20 pm

Posted in java

Jopr has embedded database support!

Jopr now supports an embedded database option during installation (using H2 under the covers). This option, aside from preparing the embedded database, also configures an embedded agent running inside the same VM. This means it takes only a few keystrokes now to install an entire Jopr system, end-to-end.

The bits can be found here.

Quick install steps:

  • download & unzip the jopr-server-2.2.1.zip binary
  • start the server using scripts at the top-level bin dir
  • go to http://localhost:7080/
  • click the “Embedded Mode” button at the top at the page (this enables the embedded db *and* the embedded agent)
  • click the install button at the bottom of the page

Comments and questions welcome on the Jopr forums.

Note: the embedded database and embedded agent options are not intended for production use. They were developed to facilitate quick installs so that new users could easily demo the product, or existing users could get a quick overview of what features the latest distribution has to offer. At the time of this writing, there is no upgrade support when choosing the embedded mode option.

Written by josephmarques

June 3, 2009 at 5:15 pm

Posted in rhq

Don’t reinvent the wheel

I can’t help but relish from time to time how powerful the Jopr / RHQ platform is. When you look at a product like Tomcat, you might think that writing an end-to-end management framework around it would take a few man-years at least.

Not so with Jopr / RHQ. Instead of spending time worrying about the ins and outs of how to move metric data around your network (and subsequently graph it), how to send a remote method execution down to your server, how to be notified when certain events take place on your servers, how to audit what users do and/or control who does what – you get all of that out-of-box with Jopr / RHQ.

So what does this all mean? It means that by leveraging a robust management platform like Jopr / RHQ you get to focus on what’s important – the servers you’re managing.

Jay Shaughnessy, a colleague of mine, provided the most recent proof of these claims. He wrote a plugin for end-to-end management of Tomcat. By leveraging Jopr / RHQ, he was able to accomplish an impressive amount in a relatively short period of time. But I’ll let him tell you about it…

http://jayshaughnessy.blogspot.com/2009/04/jopr-22-adds-tomcat-management.html

Written by josephmarques

May 1, 2009 at 9:09 pm

Posted in rhq

Jopr 2.2.0 released!

Jopr 2.2.0 has finally been posted to SourceForge!

I’m happy to say that if you download the bits today, assuming you’ve been a user since its initial release, you may very well have trouble recognizing that you’re using the same product as before. Although this release was originally focused on just two items – enhancements to support cluster-oriented views and support for monitoring/managing external Tomcat instances – so much more was accomplished.

As you’ll see below, and as the 2.2 sneak peek clearly revealed, usability inadvertently became a focus of this release – a new menu bar with resource and group search functions, resource and group navigation trees with right-click context menus, and a sticky tabbing infrastructure are just some of the notable enhancements.

I’m especially pleased with the number of performance fixes that made it into this release. These weren’t just constrained to back-end processes or how the agents and servers communicate, but several people have been testing the 2.2.0 BETA and have noted they can feel the difference in the user interface. If there’s any one thing that would frustrate me about using a software product I’d have to say it’s how long I wait between when I click something in the UI and when I receive feedback about that action. So we hope you – the community – appreciate the time we took to improve that part of the overall experience for you.

For those that like details, I’ve again gone through JIRA and scanned the more than 600 issues resolved in this release. Below is a summary of all of the noteworthy changes:

UI Enhancements

  • complete rewrite of event history and monitoring UI pages from struts/tiles to jsf/facelets for resources, groups, and autogroups
  • addition of availability history page for resources
  • addition of xmas tree lights for autogroups and compatible groups
  • improve display of availability and membership counts for groups
  • brand new menu bar to replace old, flat link menu
  • brand new resource and group tree navigation components, with right-click context menu support
  • subsystem views – config updates, operation history, new oob metrics, alert definitions and alert history
  • several other markers/icons added to the overview>timeline page, and fixed it so it works in IE6
  • sticky tabs when navigating between resources and a split pane divider that remembers its relative percentage location
  • gave our tabbing infrastructure a facelift, moved from image rollovers to pure css, and are now using a wicked cool gradient
  • revamped resource and group favorites into the menu bar
  • support for recently visited resources and groups in the menu bar
  • auto-complete / search for resources and groups in the menu bar
  • several fixes for ops and alerts pages to improve usability
  • added new resource summary page
  • updated nearly all of our icons across the entire site
  • numerous IE6 and IE7 fixes

Notable Features

  • self-healing agents when they detect they are sick
  • clean up of authorization rules across several pages in the app
  • change to how authz is done – explicit res group denotes membership, while implicit res group is solely for authorization
  • dynagroup querying off of current res availability; also support for parent / grandparent / child context querying
  • support for dynagroup filtering and pivoting off of properties with NULL values
  • auto-recalculation of dynagroups, added some dynagroup “templates” too
  • proper use of XA transactions
  • config change detection + alerting off of that
  • support for aggregate config w/group res config & group plugin config implementations

Plugin Development / Deployment Improvements

  • standalone tomcat and EWS support
  • fix hot deploy of plugins
  • new plugin generator
  • pushed deploy of plugins to DB now, not just filesystem
  • lenient deployment of plugins whose dependency graph is not satisfied

Caching / Performance Tuning

  • fixed paginatedDataTables so they only load once (page-scoped caching)
  • all user preferences cached to the session
  • MICA icon generation scheme doesn’t require DB hits at all – all done via in-memory lookup
  • solved a dozen or so N+1 (or worse) query issues
  • revamped measurement out-of-bounds system
  • fine-grained updates for measurement baseline calculations
  • precompute of current availability for resources
  • replaced resource group SLSB methods with native sql solutions, supporting recursive rules
  • alerts cache rewritten to be near-lockless, and reloading cache doesn’t block readers
  • upgraded quartz library, resolved qrtz_table locking issues
  • using sigar proxy cache
  • async committal of measurement data on postgres
  • events throttling

So download it, try it out, and ping back here or on the Jopr forums if you have any questions.

Written by josephmarques

April 30, 2009 at 1:55 pm

Posted in rhq

Custom JSF/Facelet Exception Handling

Have you ever used a JSF-based application, navigated to some page, only to see a nasty error?

This is what you get out-of-box with facelets. Most of the time it happens when the facelet tries to resolve some EL expression and needs to create some JSF managed bean, but one or more required URL parameters are either missing or have invalid values.

In a development environment, it makes sense to show this page because the various pieces of contextual information (full stack trace + JSF component tree + variables in scope) provide plenty of clues with which to diagnose the issue. However, when you ship a product to a customer or push your changes to a production environment, it would be nice to change the behavior and provide a pleasant error page to the user.

Fortunately, the facelets framework makes overriding this default behavior incredibly simple. The basic premise is to redirect to a custom error page so you can provide a layout that hides the unappealing stack trace, but which still provides a link to view those details (primarily so your customers can report the bugs back to you).

Note: the following code examples will be pulled directly from the RHQ / Jopr code base.

The first step is to add a custom view handler to your web application. Open up the faces-config.xml file and register it:

<faces-config ...>
   <application>
      <view-handler>org.rhq.enterprise.gui.common.framework.FaceletRedirectionViewHandler</view-handler>
      ...
   </application>
</faces-config>

Then override the default mechanism for dealing with errors that the facelet framework encounters:

public class FaceletRedirectionViewHandler extends FaceletViewHandler {
    ...
    @Override
    protected void handleRenderException(FacesContext context, Exception ex) throws IOException, ELException,
        FacesException {
        try {
            if (context.getViewRoot().getViewId().equals("/rhq/common/error.xhtml")) {
                /*
                 * This is to protect from infinite redirects if the error
                 * page itself is updated in the future and has an error
                 */
                log.error("Redirected back to ourselves, there must be a problem with the error.xhtml page", ex);
                return;
            }

            getSessionMap().put("GLOBAL_RENDER_ERROR", ex);
            getHttpResponseObject().sendRedirect("/rhq/common/error.xhtml");
        } catch (IOException ioe) {
            log.fatal("Could not process redirect to handle application error", ioe);
        }
    }
}

View full source here

The basic strategy is to capture the exception and redirect to your new error page. However, what if there is a compile error on the error page itself? Your new error handling code would capture that error and try to handle it, redirecting back to the custom error page which still has that error on it! This leads to an infinite redirect, which all modern browsers can and will detect, but which doesn’t provide much useful information as to what error occurred on the page.

You might be asking yourself, “How likely is it that the error page will have an error?” Well, this could happen either as you’re writing the page for the first time, or are updating the page in the future to add other features. Fortunately, there is an easy way of protecting against this. In your error handling code, simply test which JSF viewId you’re coming from and, if it’s your new error page, then revert back to the default handler by calling the superclass method being overridden. Don’t forget to explicitly break/return out of the custom handler, otherwise you’ll still see the infinite recursion.

The next and final step is to write the logic that pulls the exception back out of the session map on the other side of the redirect:

public class GenericErrorUIBean {
    String summary; // the name of the exception class, usually self-descriptive
    String details; // a little more information about the named exception
    List<Tuple> trace; // fine-grained trace details

    public GenericErrorUIBean() {
        trace = new ArrayList<Tuple>();
        Throwable ex = (Throwable) getSessionMap().remove("GLOBAL_RENDER_ERROR");

        String message = ex.getLocalizedMessage();
        String stack = getFirstStackTrace(ex);
        trace.add(new Tuple(message, stack));
        while (ex.getCause() != null) {
            ex = ex.getCause();
            message = ex.getLocalizedMessage();
            stack = getFirstStackTrace(ex);
            trace.add(new Tuple(message, stack));
        }

        // after the loop, ex refers to the root cause
        summary = ex.getClass().getSimpleName();
        details = ex.getMessage();
    }

    private static String getFirstStackTrace(Throwable t) {
        if (t == null) {
            return null;
        }
        StringWriter sw = new StringWriter();
        PrintWriter pw = new PrintWriter(sw);
        t.printStackTrace(pw);
        return sw.toString();
    }
}

View full source here

Depending on what you want your error page to look like, you can use a variety of methods for chopping up the stack trace into more manageable and/or user-friendly bits. In this case, the solution I used for the RHQ platform loops over the full trace one stack frame at a time. At each frame, we record the name of the exception class and map it to the exception stack at that frame. In this fashion, we can generate a list of tuples which can be iterated over in our error.xhtml facelet with either the ui:repeat or a4j:repeat tag.

The end result is a page that looks professional by hiding the ugly errors and showing the root cause in simple, easy-to-understand language.
Note, however, that the page can still be used as a debugging tool. See the “view the stack trace” link at the bottom? Clicking it opens up a modal dialog which displays the full stack trace (which is useful for reporting bugs through customer support [for downloaded products], or emailing the webmaster [for hosted applications]).

For brevity, the source code of the error.xhtml facelet has been left out of this blog entry, but you can view the full source here.

This is just one of the many solutions employed by the RHQ platform to provide a great web driven interface that centralizes the monitoring and management of your enterprise systems. To find out more, please visit one of the links below:

RHQ (base management platform) – http://www.rhq-project.org/
Jopr (Jboss specific extensions to RHQ) – http://www.jboss.org/jopr/

Written by josephmarques

April 27, 2009 at 7:29 am

Posted in webdev
