Java: How to handle large datasets with JPA (or at least with Hibernate)?
I need to make my web application work with really large datasets. At the moment I get either an OutOfMemoryException or output that takes 1-2 minutes to generate.

Let's keep it simple and suppose that we have two tables in the DB: Worker and WorkLog, with about 1000 rows in the first one and 10000 rows in the second one. The latter table has several fields, including 'workerId' and 'hoursWorked'. What we need is:

1. to calculate the total hours worked by each user;
2. a list of work periods for each user.

The most straightforward approach (IMO) for each task in plain SQL is:

1)
select Worker.name, sum(hoursWorked) from Worker, WorkLog
where Worker.id = WorkLog.workerId
group by Worker.name;
//results of this query should be transformed to a Multimap

2)
select Worker.name, WorkLog.start, WorkLog.hoursWorked from Worker, WorkLog
where Worker.id = WorkLog.workerId;
//results of this query should be transformed to Multimap<Worker, Period>
//if it were JDBC, it would be vital
//to set resultSet.setFetchSize(someSmallNumber), say ~100

So, I have two questions:
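A StatelessSession has no persistence context associated with it and does not provide many of the higher-level life cycle semantics. In particular, a stateless session does not implement a first-level cache, nor does it interact with any second-level or query cache. It does not implement transactional write-behind or automatic dirty checking. Operations performed using a stateless session never cascade to associated instances. Collections are ignored by a stateless session. Operations performed via a stateless session bypass Hibernate's event model and interceptors. Due to the lack of a first-level cache, stateless sessions are vulnerable to data aliasing effects. A stateless session is a lower-level abstraction that is much closer to the underlying JDBC.

In the StatelessSession code example (the one that scrolls over the GetCustomers named query), the Customer instances returned by the query are immediately detached. They are never associated with any persistence context.

The insert(), update() and delete() operations defined by the StatelessSession interface are considered to be direct database row-level operations. They result in the immediate execution of a SQL INSERT, UPDATE or DELETE respectively. They have different semantics from the save(), saveOrUpdate() and delete() operations defined by the Session interface.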
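Raw SQL shouldn't be considered a last resort. It should still be considered an option if you want to keep things "standard" at the JPA level but not at the database level. JPA also supports native queries, for which it will still do the mapping to standard entities for you.

However, if you have a large result set that can't be processed in the database, then you really should just use plain JDBC, because JPA (the standard) doesn't support streaming of large sets of data.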
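To make the plain-JDBC route a bit more concrete, here is a minimal sketch for the second query from the question. The class name WorkLogJdbcReader, the Period value type and the column types (a timestamp for start, an int for hoursWorked) are assumptions made for illustration; only the table and column names come from the question. Note that the fetch size is just a hint to the driver, and some drivers need extra settings before they actually stream rows.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Timestamp;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of any library.
final class WorkLogJdbcReader {

    /** Assumed shape of one work period: start time plus hours worked. */
    static final class Period {
        final Timestamp start;
        final int hoursWorked;

        Period(Timestamp start, int hoursWorked) {
            this.start = start;
            this.hoursWorked = hoursWorked;
        }
    }

    // Same join as in the question, written with explicit join syntax.
    private static final String SQL =
            "select w.name, wl.start, wl.hoursWorked "
          + "from Worker w join WorkLog wl on w.id = wl.workerId";

    /** Reads all work periods, grouped by worker name, fetching rows in small batches. */
    static Map<String, List<Period>> readPeriods(Connection con) throws SQLException {
        Map<String, List<Period>> periods = new LinkedHashMap<>();
        try (PreparedStatement ps = con.prepareStatement(
                SQL, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            ps.setFetchSize(100); // hint: pull ~100 rows per round trip
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    addPeriod(periods, rs.getString(1),
                            new Period(rs.getTimestamp(2), rs.getInt(3)));
                }
            }
        }
        return periods;
    }

    /** Appends one period to the list kept for the given worker. */
    static void addPeriod(Map<String, List<Period>> periods, String worker, Period period) {
        periods.computeIfAbsent(worker, k -> new ArrayList<>()).add(period);
    }
}
```

The resulting map still holds every period in memory, so this only helps with the memory used while reading; if the rows can be processed one at a time, the body of the while loop is the place to do that instead of collecting them.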
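If you use JPA implementation-specific constructs, it will be harder to port your application between application servers, because the JPA engine is embedded in the application server and you may have no control over which JPA provider is used.

I'm using something like this and it works very fast. I also hate to use native SQL, as our application should run on any database.

The following results in a very optimized SQL statement and returns a list of records, each of which is a map: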
String hql = "select distinct " +
"t.uuid as uuid, t.title as title, t.code as code, t.date as date, t.dueDate as dueDate, " +
"t.startDate as startDate, t.endDate as endDate, t.constraintDate as constraintDate, t.closureDate as closureDate, t.creationDate as creationDate, " +
"sc.category as category, sp.priority as priority, sd.difficulty as difficulty, t.progress as progress, st.type as type, " +
"ss.status as status, ss.color as rowColor, (p.rKey || ' ' || p.name) as project, ps.status as projectstatus, (r.code || ' ' || r.title) as requirement, " +
"t.estimate as estimate, w.title as workgroup, o.name || ' ' || o.surname as owner, " +
"ROUND(sum(COALESCE(a.duration, 0)) * 100 / case when ((COALESCE(t.estimate, 0) * COALESCE(t.progress, 0)) = 0) then 1 else (COALESCE(t.estimate, 0) * COALESCE(t.progress, 0)) end, 2) as factor " +
"from " + Task.class.getName() + " t " +
"left join t.category sc " +
"left join t.priority sp " +
"left join t.difficulty sd " +
"left join t.taskType st " +
"left join t.status ss " +
"left join t.project p " +
"left join t.owner o " +
"left join t.workgroup w " +
"left join p.status ps " +
"left join t.requirement r " +
"left join p.status sps " +
"left join t.iterationTasks it " +
"left join t.taskActivities a " +
"left join it.iteration i " +
"where sps.active = true and " +
"ss.done = false and " +
"(i.uuid <> :iterationUuid or it.uuid is null) " + filterHql +
"group by t.uuid, t.title, t.code, t.date, t.dueDate, " +
"t.startDate, t.endDate, t.constraintDate, t.closureDate, t.creationDate, " +
"sc.category, sp.priority, sd.difficulty, t.progress, st.type, " +
"ss.status, ss.color, p.rKey, p.name, ps.status, r.code, r.title, " +
"t.estimate, w.title, o.name, o.surname " + sortHql;
if (logger.isDebugEnabled()) {
    logger.debug("Executing hql: " + hql);
}

Query query = hibernateTemplate.getSessionFactory().getCurrentSession().getSession(EntityMode.MAP).createQuery(hql);

// Bind the filter parameters; collection values are bound as parameter lists
for (String key : filterValues.keySet()) {
    Object valueSet = filterValues.get(key);
    if (logger.isDebugEnabled()) {
        logger.debug("Setting query parameter for " + key);
    }
    if (valueSet instanceof java.util.Collection<?>) {
        query.setParameterList(key, (Collection<?>) valueSet);
    } else {
        query.setParameter(key, valueSet);
    }
}
query.setString("iterationUuid", iteration.getUuid());

// Return each row as a Map keyed by the select aliases
query.setResultTransformer(Transformers.ALIAS_TO_ENTITY_MAP);

if (logger.isDebugEnabled()) {
    logger.debug("Query building complete.");
    logger.debug("SQL: " + query.getQueryString());
}
return query.list();
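Because the ALIAS_TO_ENTITY_MAP transformer hands back each row as a Map keyed by the select aliases, the caller ends up reading values by alias and casting them. A small sketch of how that could be wrapped is below; RowMaps and its methods are hypothetical helpers made up for this illustration, not Hibernate API, and the alias names used with them have to match the ones in the select clause.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper for rows returned as alias -> value maps.
final class RowMaps {

    private RowMaps() {
    }

    /** Reads one aliased value from a row; fails with ClassCastException on a type mismatch. */
    static <T> T value(Map<String, Object> row, String alias, Class<T> type) {
        // Class.cast returns null for a null value, so missing or null columns yield null
        return type.cast(row.get(alias));
    }

    /** Collects a single String-valued alias (for example "title") from all rows. */
    static List<String> column(List<Map<String, Object>> rows, String alias) {
        List<String> values = new ArrayList<>(rows.size());
        for (Map<String, Object> row : rows) {
            values.add(value(row, alias, String.class));
        }
        return values;
    }
}
```

Since query.list() returns an untyped List, the result would need an unchecked cast to List<Map<String, Object>> before being passed to a helper like this.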
StatelessSession session = sessionFactory.openStatelessSession();
Transaction tx = session.beginTransaction();
ScrollableResults customers = session.getNamedQuery("GetCustomers")
    .scroll(ScrollMode.FORWARD_ONLY);
while ( customers.next() ) {
    Customer customer = (Customer) customers.get(0);
    customer.updateStuff(...);
    session.update(customer);
}
tx.commit();
session.close();
select w, sum(wl.hoursWorked)
from Worker w, WorkLog wl
where w.id = wl.workerId
group by w

select w, sum(wl.hoursWorked)
from Worker w join w.workLogs wl
group by w

select new WorkerTotal( w, sum(wl.hoursWorked) )
from Worker w join w.workLogs wl
group by w

select new WorkerTotal( w.id, w.name, sum(wl.hoursWorked) )
from Worker w join w.workLogs wl
group by w.id, w.name

select w, new Period( wl.start, wl.hoursWorked )
from Worker w join w.workLogs wl
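The constructor expressions above need a WorkerTotal class with a matching constructor. A possible shape for the three-argument variant is sketched below. The field types are assumptions: it supposes that Worker.id is a Long and that hoursWorked is an integral column, in which case JPQL's sum() yields a Long. The byName helper is also made up for this example; it just turns the result list into the name-to-total map the question asks for.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Declared without "public" only so the sketch compiles as a single file;
// as a JPQL constructor target it would normally be a public class in its
// own file, referenced in the query by its fully qualified name.
final class WorkerTotal {

    private final Long workerId;
    private final String name;
    private final Long totalHours;

    // Argument order and types must match: new WorkerTotal(w.id, w.name, sum(wl.hoursWorked))
    public WorkerTotal(Long workerId, String name, Long totalHours) {
        this.workerId = workerId;
        this.name = name;
        this.totalHours = totalHours;
    }

    public Long getWorkerId() {
        return workerId;
    }

    public String getName() {
        return name;
    }

    public Long getTotalHours() {
        return totalHours;
    }

    /** Hypothetical helper: total hours per worker name, in result order. */
    static Map<String, Long> byName(List<WorkerTotal> rows) {
        Map<String, Long> totals = new LinkedHashMap<>();
        for (WorkerTotal row : rows) {
            // Treat a null sum as zero hours
            long hours = row.getTotalHours() == null ? 0L : row.getTotalHours();
            totals.merge(row.getName(), hours, Long::sum);
        }
        return totals;
    }
}
```

With a class like this, the aggregation stays in the database and only one small object per worker is loaded, so for roughly 1000 workers the whole result should fit in memory comfortably.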
Query query = em.createQuery...
query.setHint(QueryHints.CURSOR, true)
     .setHint(QueryHints.SCROLLABLE_CURSOR, true);
ScrollableCursor scrl = (ScrollableCursor) query.getSingleResult();
// Iterate with hasNext() rather than relying on next() returning null at the end
while (scrl.hasNext()) {
    Object o = scrl.next();
    ...
}
scrl.close();