Thursday, 13 September 2018

DeadLock Identification from a Single Javacore File in WebSphere on Linux .

Sample Javacore Extracts and Locking Explanation for a DeadLock Condition:

Deadlock detected !!!

Thread "WebContainer : 6" (0x0000000009306B00)
is waiting for:
sys_mon_t:0x00002B9F669C9D50 infl_mon_t: 0x00002B9F669C9DC8:
java/lang/Object@0x000000002D7D8810
which is owned by:
Thread "Agent Heartbeat" (0x0000000000808400)
which is waiting for:
sys_mon_t:0x00002B9F681EA980 infl_mon_t: 0x00002B9F681EA9F8:
com/ibm/ws/session/store/memory/MemorySession@0x000000011EF1EF18
which is owned by:
Thread "WebContainer : 29" (0x000000000A4E7500)
which is waiting for:
sys_mon_t:0x00002B9F6AA7A7C0 infl_mon_t: 0x00002B9F6AA7A838:
org/apache/log4j/spi/RootLogger@0x0000000025878DC0
which is owned by:
Thread "WebContainer : 6" (0x0000000009306B00)
 

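The above entries from the LOCKS section of the Javacore file show a DeadLock among three threads: 'WebContainer : 6', 'Agent Heartbeat' and 'WebContainer : 29'. Each thread owns a monitor that the next thread in the chain is waiting for, and the last thread waits for a monitor owned by the first, so none of them can proceed.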
Note:  When reviewing SystemOut.log from the logs directory of the WebSphere profile, DeadLocks will most probably show up as WSVR0605W hung thread warning messages. They can only be diagnosed as a "DeadLock" by reviewing the Javacores, either in a text editor or with specialized tooling such as IBM Thread and Monitor Dump Analyzer for Java.
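
The cycle reported in the LOCKS section can also be confirmed mechanically. The following is a minimal Java sketch, not part of any IBM tooling: it assumes the "is waiting for / which is owned by" pairs have already been extracted by hand into a map from each blocked thread to the thread that owns the monitor it wants. The class and method names (WaitForGraph, findCycle, sampleEdges) are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: follows "waits for a monitor owned by" edges until a thread repeats.
class WaitForGraph {

    // Returns the threads forming a cycle reachable from 'start', or an empty list if the chain ends.
    static List<String> findCycle(Map<String, String> blockedBy, String start) {
        List<String> path = new ArrayList<>();
        String current = start;
        while (current != null) {
            int seenAt = path.indexOf(current);
            if (seenAt >= 0) {
                // Everything from the first occurrence onward is the cycle.
                return new ArrayList<>(path.subList(seenAt, path.size()));
            }
            path.add(current);
            current = blockedBy.get(current); // null when the owner is not itself blocked
        }
        return new ArrayList<>();
    }

    // The three edges reported in the Javacore extract above.
    static Map<String, String> sampleEdges() {
        Map<String, String> blockedBy = new LinkedHashMap<>();
        blockedBy.put("WebContainer : 6", "Agent Heartbeat");
        blockedBy.put("Agent Heartbeat", "WebContainer : 29");
        blockedBy.put("WebContainer : 29", "WebContainer : 6");
        return blockedBy;
    }

    public static void main(String[] args) {
        System.out.println(findCycle(sampleEdges(), "WebContainer : 6"));
    }
}
```

A non-empty result means the blocked threads can never make progress on their own, which is what distinguishes a DeadLock from an ordinary hung thread warning.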
 

-->>  The 'WebContainer : 6' thread is handling an exception raised under the Spring Framework. The exception is being logged through org/apache/log4j, for which the thread acquired the 'root logger' lock at org/apache/log4j/Category.callAppenders(Category.java:203(Compiled Code)).
While still holding that lock, the thread calls a mail service (the log4j SMTPAppender) and is being introspected by Wily.  For introspection, Wily needs the 'StallSweeper' lock, which is owned by the "Agent Heartbeat" thread.

Thread Name
WebContainer : 6
State
Deadlock/Blocked

Monitor
Owns Monitor Lock on org/apache/log4j/spi/RootLogger@0x0000000025878DC0
Waiting for Monitor Lock on java/lang/Object@0x000000002D7D8810

Java Stack
at com/wily/introscope/agent/blamestackfeature/BlameStackFeatureBlameStack.IBlameStack_addExtraParameter(BlameStackFeatureBlameStack.java:105(Compiled Code))
at com/wily/introscope/agent/blame/DuplicateHandlingBlameStack.IBlameStack_addExtraParameter(DuplicateHandlingBlameStack.java:60(Compiled Code))
at com/wily/introscope/agent/blame/CompoundBlameStack.IBlameStack_addExtraParameter(CompoundBlameStack.java:165(Compiled Code))
at com/wily/introscope/agent/blame/ComponentTracer.addExtraParameter(ComponentTracer.java:328(Compiled Code))
at com/wily/introscope/agent/blame/ComponentTracer.addExtraParameter(ComponentTracer.java:316(Compiled Code))
at com/wily/introscope/agent/trace/io/SocketBackendTracer.annotateBlameStack(SocketBackendTracer.java:153(Compiled Code))
at java/net/ManagedSocketInputStreamHighPerformance.annotateBlameStack(ManagedSocketInputStreamHighPerformance.java:328(Compiled Code))
at java/net/ManagedSocketInputStreamHighPerformance.read(ManagedSocketInputStreamHighPerformance.java:231(Compiled Code))
at com/sun/mail/util/TraceInputStream.read(TraceInputStream.java:106(Compiled Code))
at java/io/BufferedInputStream.fill(BufferedInputStream.java:229(Compiled Code))
at java/io/BufferedInputStream.read(BufferedInputStream.java:248(Compiled Code))
at com/sun/mail/util/LineInputStream.readLine(LineInputStream.java:84(Compiled Code))
at com/sun/mail/smtp/SMTPTransport.readServerResponse(SMTPTransport.java:1742(Compiled Code))
at com/sun/mail/smtp/SMTPTransport.openServer(SMTPTransport.java:1523(Compiled Code))
at com/sun/mail/smtp/SMTPTransport.protocolConnect(SMTPTransport.java:453(Compiled Code))
at javax/mail/Service.connect(Service.java:291(Compiled Code))
at javax/mail/Service.connect(Service.java:172(Compiled Code))
at javax/mail/Service.connect(Service.java:121(Compiled Code))
at javax/mail/Transport.send0(Transport.java:190(Compiled Code))
at javax/mail/Transport.send(Transport.java:120(Compiled Code))
at org/apache/log4j/net/SMTPAppender.sendBuffer(Bytecode PC:214(Compiled Code))
at org/apache/log4j/net/SMTPAppender.append(Bytecode PC:56)
at org/apache/log4j/AppenderSkeleton.doAppend(AppenderSkeleton.java:230(Compiled Code))
at org/apache/log4j/helpers/AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:65(Compiled Code))
at org/apache/log4j/Category.callAppenders(Category.java:203(Compiled Code))
at org/apache/log4j/Category.forcedLog(Category.java:388(Compiled Code))
at org/apache/log4j/Category.error(Category.java:302)
at com/applicationX/pocs/ui/web/controllers/ExceptionHandler.resolveException(ExceptionHandler.java:61)
at org/springframework/web/servlet/DispatcherServlet.processHandlerException(DispatcherServlet.java:1120(Compiled Code))
at org/springframework/web/servlet/DispatcherServlet.doDispatch(DispatcherServlet.java:944(Compiled Code))
at org/springframework/web/servlet/DispatcherServlet.doService(DispatcherServlet.java:852(Compiled Code))
at org/springframework/web/servlet/FrameworkServlet.processRequest(FrameworkServlet.java:882(Compiled Code))
at org/springframework/web/servlet/FrameworkServlet.doGet(FrameworkServlet.java:778(Compiled Code))
at javax/servlet/http/HttpServlet.service(HttpServlet.java:575(Compiled Code))
at javax/servlet/http/HttpServlet.service(HttpServlet.java:668(Compiled Code))
at com/ibm/ws/webcontainer/servlet/ServletWrapper.service(ServletWrapper.java:1227(Compiled Code))
at com/ibm/ws/webcontainer/servlet/ServletWrapper.handleRequest(ServletWrapper.java:776(Compiled Code))
at com/ibm/ws/webcontainer/servlet/ServletWrapper.handleRequest(ServletWrapper.java:458(Compiled Code))
at com/ibm/ws/webcontainer/servlet/ServletWrapperImpl.handleRequest(ServletWrapperImpl.java:178(Compiled Code))
at com/ibm/ws/webcontainer/filter/WebAppFilterChain.invokeTarget(WebAppFilterChain.java:136(Compiled Code))
at com/ibm/ws/webcontainer/filter/WebAppFilterChain.doFilter(WebAppFilterChain.java:97(Compiled Code))
at com/applicationX/pocs/ui/web/util/SessionExpiryFilter.doFilter(SessionExpiryFilter.java:60(Compiled Code))
at com/ibm/ws/webcontainer/filter/FilterInstanceWrapper.doFilter(FilterInstanceWrapper.java:195(Compiled Code))
at com/ibm/ws/webcontainer/filter/WebAppFilterChain.doFilter(WebAppFilterChain.java:91(Compiled Code))
at com/applicationX/pocs/ui/web/util/CommonFilter.doFilter(CommonFilter.java:44(Compiled Code))
at com/ibm/ws/webcontainer/filter/FilterInstanceWrapper.doFilter(FilterInstanceWrapper.java:195(Compiled Code))
at com/ibm/ws/webcontainer/filter/WebAppFilterChain.doFilter(WebAppFilterChain.java:91(Compiled Code))
at com/applicationX/pocs/ui/web/util/GZipServletFilter.doFilter(GZipServletFilter.java:42(Compiled Code))
at com/ibm/ws/webcontainer/filter/FilterInstanceWrapper.doFilter(FilterInstanceWrapper.java:195(Compiled Code))
at com/ibm/ws/webcontainer/filter/WebAppFilterChain.doFilter(WebAppFilterChain.java:91(Compiled Code))
at com/ibm/ws/webcontainer/filter/WebAppFilterManager.doFilter(WebAppFilterManager.java:928(Compiled Code))
at com/ibm/ws/webcontainer/filter/WebAppFilterManager.invokeFilters(WebAppFilterManager.java:1025(Compiled Code))
at com/ibm/ws/webcontainer/webapp/WebApp.handleRequest(WebApp.java:3761(Compiled Code))
at com/ibm/ws/webcontainer/webapp/WebGroup.handleRequest(WebGroup.java:304(Compiled Code))
at com/ibm/ws/webcontainer/WebContainer.handleRequest(WebContainer.java:976(Compiled Code))
at com/ibm/ws/webcontainer/WSWebContainer.handleRequest(WSWebContainer.java:1662(Compiled Code))
at com/ibm/ws/webcontainer/channel/WCChannelLink.ready(WCChannelLink.java:200(Compiled Code))
at com/ibm/ws/http/channel/inbound/impl/HttpInboundLink.handleDiscrimination(HttpInboundLink.java:459(Compiled Code))
at com/ibm/ws/http/channel/inbound/impl/HttpInboundLink.handleNewRequest(HttpInboundLink.java:526(Compiled Code))
at com/ibm/ws/http/channel/inbound/impl/HttpInboundLink.processRequest(HttpInboundLink.java:312(Compiled Code))
at com/ibm/ws/http/channel/inbound/impl/HttpInboundLink.ready(HttpInboundLink.java:283(Compiled Code))
at com/ibm/ws/ssl/channel/impl/SSLConnectionLink.determineNextChannel(SSLConnectionLink.java:1048(Compiled Code))
at com/ibm/ws/ssl/channel/impl/SSLConnectionLink.readyInboundPostHandshake(SSLConnectionLink.java:716(Compiled Code))
at com/ibm/ws/ssl/channel/impl/SSLConnectionLink$MyHandshakeCompletedCallback.complete(SSLConnectionLink.java:412(Compiled Code))
at com/ibm/ws/ssl/channel/impl/SSLUtils.handleHandshake(SSLUtils.java:1066(Compiled Code))
at com/ibm/ws/ssl/channel/impl/SSLHandshakeIOCallback.complete(SSLHandshakeIOCallback.java:87(Compiled Code))
at com/ibm/ws/tcp/channel/impl/AioReadCompletionListener.futureCompleted(AioReadCompletionListener.java:175(Compiled Code))
at com/ibm/io/async/AbstractAsyncFuture.invokeCallback(AbstractAsyncFuture.java:217(Compiled Code))
at com/ibm/io/async/AsyncChannelFuture.fireCompletionActions(AsyncChannelFuture.java:161(Compiled Code))
at com/ibm/io/async/AsyncFuture.completed(AsyncFuture.java:138(Compiled Code))
at com/ibm/io/async/ResultHandler.complete(ResultHandler.java:204(Compiled Code))
at com/ibm/io/async/ResultHandler.runEventProcessingLoop(ResultHandler.java:816(Compiled Code))
at com/ibm/io/async/ResultHandler$2.run(ResultHandler.java:905(Compiled Code))
at com/ibm/ws/util/ThreadPool$Worker.run(ThreadPool.java:1862(Compiled Code))

Waiting Threads:  36
Thread-76
WebContainer : 0
WebContainer : 1
WebContainer : 10
WebContainer : 11
WebContainer : 15
WebContainer : 17
WebContainer : 19
WebContainer : 2
WebContainer : 20
WebContainer : 21
WebContainer : 22
WebContainer : 23
WebContainer : 24
WebContainer : 25
WebContainer : 26
WebContainer : 29
WebContainer : 3
WebContainer : 31
WebContainer : 32
WebContainer : 36
WebContainer : 37
WebContainer : 39
WebContainer : 4
WebContainer : 40
WebContainer : 41
WebContainer : 45
WebContainer : 46
WebContainer : 47
WebContainer : 48
WebContainer : 49
WebContainer : 5
WebContainer : 7
WebContainer : 8
WebContainer : 9
pool-13-thread-1

Blocked by:  1
Agent Heartbeat



-->> The 'Agent Heartbeat' thread is in progress with its introspection and owns the 'StallSweeper' lock, but to complete the process it needs the 'MemorySession' lock, which is owned by the "WebContainer : 29" thread.

Thread Name
Agent Heartbeat
State
Deadlock/Blocked

Monitor
Owns Monitor Lock on java/lang/Object@0x000000002D7D8810
Waiting for Monitor Lock on com/ibm/ws/session/store/memory/MemorySession@0x000000011EF1EF18

Java Stack
at com/ibm/ws/session/store/memory/MemorySession.updateLastAccessTime(MemorySession.java:638(Compiled Code))
at com/ibm/ws/session/store/memory/MemoryStore.getSession(MemoryStore.java:194(Compiled Code))
at com/ibm/ws/session/store/memory/MemoryStore.getSession(MemoryStore.java:712(Compiled Code))
at com/ibm/ws/session/SessionManager.getSessionFromStore(SessionManager.java:497(Compiled Code))
at com/ibm/ws/session/SessionManager.getSession(SessionManager.java:476(Compiled Code))
at com/ibm/ws/session/SessionManager.getSession(SessionManager.java:462(Compiled Code))
at com/ibm/ws/session/SessionManager.getSession(SessionManager.java:693(Compiled Code))
at com/ibm/ws/session/SessionContext.getIHttpSession(SessionContext.java:466(Compiled Code))
at com/ibm/ws/session/SessionContext.getIHttpSession(SessionContext.java:426(Compiled Code))
at com/ibm/ws/webcontainer/srt/SRTRequestContext.getSession(SRTRequestContext.java:104(Compiled Code))
at com/ibm/ws/webcontainer/srt/SRTServletRequest.getSession(SRTServletRequest.java:2152(Compiled Code))
at sun/reflect/GeneratedMethodAccessor406.invoke(Bytecode PC:65(Compiled Code))
at sun/reflect/DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37(Compiled Code))
at java/lang/reflect/Method.invoke(Method.java:611(Compiled Code))
at com/wily/introscope/agent/trace/servlet/ObjectWrapper.invokeMethodOnObject(ObjectWrapper.java:102(Compiled Code))
at com/wily/introscope/agent/trace/servlet/RequestWrapper.getSession(RequestWrapper.java:84(Compiled Code))
at com/wily/introscope/agent/trace/servlet/ServletParameterLoader.addSessionID(ServletParameterLoader.java:687(Compiled Code))
at com/wily/introscope/agent/trace/servlet/ServletParameterLoader.addParameters(ServletParameterLoader.java:625(Compiled Code))
at com/wily/introscope/agent/trace/servlet/ServletParameterLoader.doWithWrappers(ServletParameterLoader.java:384(Compiled Code))
at com/wily/introscope/agent/trace/servlet/ServletInvocationDataHelper$SafeGetServletWrappers.execute(ServletInvocationDataHelper.java:89(Compiled Code))
at com/wily/introscope/agent/trace/servlet/ServletInvocationDataHelper$SafeExecuteOnInvocationDataWithThrottling.executeSafe(ServletInvocationDataHelper.java:36(Compiled Code))
at com/wily/introscope/agent/trace/HttpServletTracer.IInvocationDataParameterCallback_addParameters(HttpServletTracer.java:551(Compiled Code))
at com/wily/introscope/agent/trace/InvocationData.IComponentParameterCallback_addParameters(InvocationData.java:580(Compiled Code))
at com/wily/introscope/agent/trace/BlamePointTracer$1.IComponentParameterCallback_addParameters(BlamePointTracer.java:196(Compiled Code))
at com/wily/introscope/agent/blamestackfeature/BlameStackFeatureStackEntry.createComponentEventData(BlameStackFeatureStackEntry.java:67(Compiled Code))
at com/wily/introscope/agent/blamestackfeature/BlameStackFeatureBlameStack.makeAndSendSnapshot(BlameStackFeatureBlameStack.java:281(Compiled Code))
at com/wily/introscope/agent/stalls/StallFeatureStackEntry.checkIfStalled(StallFeatureStackEntry.java:76(Compiled Code))
at com/wily/introscope/agent/stalls/StallFeature.checkIfStalled(StallFeature.java:179(Compiled Code))
at com/wily/introscope/agent/stalls/StallSweeper.sweepStacks(StallSweeper.java:125(Compiled Code))
at com/wily/introscope/agent/stalls/StallSweeper.ITimestampedRunnable_execute(StallSweeper.java:96(Compiled Code))
at com/wily/util/heartbeat/IntervalHeartbeat$BehaviorNode.execute(IntervalHeartbeat.java:944(Compiled Code))
at com/wily/util/heartbeat/IntervalHeartbeat.executeNextBehaviorAndCalculateSleepTime(IntervalHeartbeat.java:489(Compiled Code))
at com/wily/util/heartbeat/IntervalHeartbeat.access$2(IntervalHeartbeat.java:443(Compiled Code))
at com/wily/util/heartbeat/IntervalHeartbeat$HeartbeatRunnable.run(IntervalHeartbeat.java:665(Compiled Code))
at java/lang/Thread.run(Thread.java:773)

Waiting Threads:  1
WebContainer : 6

Blocked by:  1
WebContainer : 29


-->>  The 'WebContainer : 29' thread is in progress with a session invalidation and owns the 'MemorySession' lock, acquired at com/ibm/ws/session/store/memory/MemorySession.invalidate(MemorySession.java:232(Compiled Code)).
To log the invalidation, however, this thread needs the 'root logger' lock, which is owned by the "WebContainer : 6" thread.
 
Thread Name
WebContainer : 29
State
Deadlock/Blocked

Monitor
Owns Monitor Lock on com/ibm/ws/session/store/memory/MemorySession@0x000000011EF1EF18
Waiting for Monitor Lock on org/apache/log4j/spi/RootLogger@0x0000000025878DC0

Java Stack
at org/apache/log4j/Category.callAppenders(Category.java:201(Compiled Code))
at org/apache/log4j/Category.forcedLog(Category.java:388(Compiled Code))
at org/apache/log4j/Category.info(Category.java:663(Compiled Code))
at com/applicationX/pocs/ui/web/util/SessionListener.sessionDestroyed(SessionListener.java:22(Compiled Code))
at com/ibm/ws/session/http/HttpSessionObserver.sessionDestroyed(HttpSessionObserver.java:179(Compiled Code))
at com/ibm/ws/session/SessionEventDispatcher.sessionDestroyed(SessionEventDispatcher.java:160(Compiled Code))
at com/ibm/ws/session/StoreCallback.sessionInvalidated(StoreCallback.java:126(Compiled Code))
at com/ibm/ws/session/store/memory/MemorySession.invalidate(MemorySession.java:232(Compiled Code))
at com/ibm/ws/session/http/HttpSessionImpl.invalidate(HttpSessionImpl.java:303(Compiled Code))
at com/ibm/ws/session/SessionData.invalidate(SessionData.java:247(Compiled Code))
at com/ibm/ws/session/HttpSessionFacade.invalidate(HttpSessionFacade.java:200(Compiled Code))
at com/applicationX/pocs/ui/web/controllers/common/ErrorPageController.display500Error(ErrorPageController.java:29)
at sun/reflect/GeneratedMethodAccessor3149.invoke(Bytecode PC:48)
at sun/reflect/DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37(Compiled Code))
at java/lang/reflect/Method.invoke(Method.java:611(Compiled Code))
at org/springframework/web/method/support/InvocableHandlerMethod.invoke(InvocableHandlerMethod.java:219(Compiled Code))
at org/springframework/web/method/support/InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:132(Compiled Code))
at org/springframework/web/servlet/mvc/method/annotation/ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:100(Compiled Code))
at org/springframework/web/servlet/mvc/method/annotation/RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:604(Compiled Code))
at org/springframework/web/servlet/mvc/method/annotation/RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:565(Compiled Code))
at org/springframework/web/servlet/mvc/method/AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:80(Compiled Code))
at org/springframework/web/servlet/DispatcherServlet.doDispatch(DispatcherServlet.java:923(Compiled Code))
at org/springframework/web/servlet/DispatcherServlet.doService(DispatcherServlet.java:852(Compiled Code))
at org/springframework/web/servlet/FrameworkServlet.processRequest(FrameworkServlet.java:882(Compiled Code))
at org/springframework/web/servlet/FrameworkServlet.doGet(FrameworkServlet.java:778(Compiled Code))
at javax/servlet/http/HttpServlet.service(HttpServlet.java:575(Compiled Code))
at javax/servlet/http/HttpServlet.service(HttpServlet.java:668(Compiled Code))
at com/ibm/ws/webcontainer/servlet/ServletWrapper.service(ServletWrapper.java:1227(Compiled Code))
at com/ibm/ws/webcontainer/servlet/ServletWrapper.handleRequest(ServletWrapper.java:776(Compiled Code))
at com/ibm/ws/webcontainer/servlet/ServletWrapper.handleRequest(ServletWrapper.java:458(Compiled Code))
at com/ibm/ws/webcontainer/servlet/ServletWrapperImpl.handleRequest(ServletWrapperImpl.java:178(Compiled Code))
at com/ibm/ws/webcontainer/filter/WebAppFilterChain.invokeTarget(WebAppFilterChain.java:136(Compiled Code))
at com/ibm/ws/webcontainer/filter/WebAppFilterChain.doFilter(WebAppFilterChain.java:97(Compiled Code))
at com/applicationX/pocs/ui/web/util/GZipServletFilter.doFilter(GZipServletFilter.java:50(Compiled Code))
at com/ibm/ws/webcontainer/filter/FilterInstanceWrapper.doFilter(FilterInstanceWrapper.java:195(Compiled Code))
at com/ibm/ws/webcontainer/filter/WebAppFilterChain.doFilter(WebAppFilterChain.java:91(Compiled Code))
at com/ibm/ws/webcontainer/filter/WebAppFilterManager.doFilter(WebAppFilterManager.java:928(Compiled Code))
at com/ibm/ws/webcontainer/filter/WebAppFilterManager.invokeFilters(WebAppFilterManager.java:1025(Compiled Code))
at com/ibm/ws/webcontainer/webapp/WebAppRequestDispatcher.dispatch(WebAppRequestDispatcher.java:1385(Compiled Code))
at com/ibm/ws/webcontainer/webapp/WebAppRequestDispatcher.forward(WebAppRequestDispatcher.java:194(Compiled Code))
at com/ibm/ws/webcontainer/webapp/WebApp.sendError(WebApp.java:3263(Compiled Code))
at com/ibm/ws/webcontainer/webapp/WebApp.handleException(WebApp.java:3791)
at com/ibm/ws/webcontainer/webapp/WebApp.handleRequest(WebApp.java:3772(Compiled Code))
at com/ibm/ws/webcontainer/webapp/WebGroup.handleRequest(WebGroup.java:304(Compiled Code))
at com/ibm/ws/webcontainer/WebContainer.handleRequest(WebContainer.java:976(Compiled Code))
at com/ibm/ws/webcontainer/WSWebContainer.handleRequest(WSWebContainer.java:1662(Compiled Code))
at com/ibm/ws/webcontainer/channel/WCChannelLink.ready(WCChannelLink.java:200(Compiled Code))
at com/ibm/ws/http/channel/inbound/impl/HttpInboundLink.handleDiscrimination(HttpInboundLink.java:459(Compiled Code))
at com/ibm/ws/http/channel/inbound/impl/HttpInboundLink.handleNewRequest(HttpInboundLink.java:526(Compiled Code))
at com/ibm/ws/http/channel/inbound/impl/HttpInboundLink.processRequest(HttpInboundLink.java:312(Compiled Code))
at com/ibm/ws/http/channel/inbound/impl/HttpInboundLink.ready(HttpInboundLink.java:283(Compiled Code))
at com/ibm/ws/ssl/channel/impl/SSLConnectionLink.determineNextChannel(SSLConnectionLink.java:1048(Compiled Code))
at com/ibm/ws/ssl/channel/impl/SSLConnectionLink.readyInboundPostHandshake(SSLConnectionLink.java:716(Compiled Code))
at com/ibm/ws/ssl/channel/impl/SSLConnectionLink$MyHandshakeCompletedCallback.complete(SSLConnectionLink.java:412(Compiled Code))
at com/ibm/ws/ssl/channel/impl/SSLUtils.handleHandshake(SSLUtils.java:1066(Compiled Code))
at com/ibm/ws/ssl/channel/impl/SSLHandshakeIOCallback.complete(SSLHandshakeIOCallback.java:87(Compiled Code))
at com/ibm/ws/tcp/channel/impl/AioReadCompletionListener.futureCompleted(AioReadCompletionListener.java:175(Compiled Code))
at com/ibm/io/async/AbstractAsyncFuture.invokeCallback(AbstractAsyncFuture.java:217(Compiled Code))
at com/ibm/io/async/AsyncChannelFuture.fireCompletionActions(AsyncChannelFuture.java:161(Compiled Code))
at com/ibm/io/async/AsyncFuture.completed(AsyncFuture.java:138(Compiled Code))
at com/ibm/io/async/ResultHandler.complete(ResultHandler.java:204(Compiled Code))
at com/ibm/io/async/ResultHandler.runEventProcessingLoop(ResultHandler.java:816(Compiled Code))
at com/ibm/io/async/ResultHandler$2.run(ResultHandler.java:905(Compiled Code))
at com/ibm/ws/util/ThreadPool$Worker.run(ThreadPool.java:1862(Compiled Code))

Waiting Threads:  1
Agent Heartbeat

Blocked by:  1
WebContainer : 6


Summary: 
The locks seen above involve the Wily Introscope Agent, the WebSphere session handler and applicationX. Resolving the DeadLock therefore needs further investigation by the appropriate product support teams and the application developers, each reviewing their corresponding code.

References: 

IBM HTTP Server and SSL helpful links .

Here are some very useful IBM HTTP Server (IHS) SSL links: Documentation about setting up SSL virtualhosts, creating keyfiles, certificates, protecting access to directories and URLs to specific ciphers, tracing and recording SSL traffic, how to create a key database file and renewing a certificate using iKeyman, how to check the password expiration information of a KDB file, how to convert personal certificates stored in a PKCS12 or PFX file into a CMS key database file.
Also: configuring IHS with two different virtualhost definitions on SSL (port 443), IP-Based Virtual Hosting, SSL certificates with Subject Alternative Names (SAN), configuring IHS for multiple SSL certificates (multiple domains) using one IP address only and configuring IHS to require an SSL client certificate for certain requests, but not for others.
Please see below for further details:

Troubleshooting Native Memory Issues .

When it comes to troubleshooting native memory issues, typically where WebContainer transport is concerned, two troubleshooting methods are quite popular:


Both troubleshooting steps aim to avoid the large native memory footprint that may occur when using AIO or asynchronous data transfer.  Most of the native memory consumed by WebContainer I/O goes to the direct byte buffers (DBBs) that host the I/O memory.  You can read about the troubleshooting methodology in the above links.  What I want to outline here are clarifications regarding AIO, NIO, DBBs, and synchronous and asynchronous transport that often come up when discussing one or both of the troubleshooting methods:

  • Both AIO and NIO use direct byte buffers (DBB) to transport data.
  • Both AIO and NIO can be synchronous or asynchronous.
  • AIO keeps additional threads outside the threadpool listening for incoming requests (these threads return quickly if there is no incoming work).
  • NIO only keeps threads outside of the threadpool when there is work to complete.
  • AIO scales better than NIO due to a more efficient use of threads.
  • AIO's additional threads consume more system resources (native memory, Java memory, CPU).
  • NIO uses less system resources than AIO at the cost of less scalability.
  • WebContainer custom property: com.ibm.ws.webcontainer.channelwritetype=sync means the WebContainer will use synchronous I/O when interfacing to the TCP Channel AIO or NIO implementation.
  • Synchronous I/O means the thread that made the I/O request will wait/block until returning to the caller with the final results of the request.
  • Synchronous I/O does not mean that AIO will be disabled.
  • In synchronous I/O, there is no need to allocate additional DBB memory to store the data overflow, as each write waits for the previous write to complete.
  • DBB memory is stored in native memory and is released after use by explicit GC calls or on the next full GC.
  • In synchronous I/O, additional buffers are not created because the WebContainer waits for the OS to finish processing before continuing.
  • Asynchronous I/O behavior means that the calling thread may return before the read or write is complete, allowing for a thread to continue processing another event. A callback on a new thread will then be used to complete the write or read.
  • Asynchronous I/O may carry DBB native memory overhead if system I/O is unable to keep up with the workload.
  • If AIO DBB native memory is a concern, use WebContainer custom property: com.ibm.ws.webcontainer.channelwritetype=sync.
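
To make the points about direct byte buffers and synchronous writes concrete, here is a minimal sketch in plain java.nio. It is not WebSphere channel framework code, and the class and method names are hypothetical. It shows a caller that blocks on each write and therefore gets by with a single reusable direct buffer, instead of queueing extra buffers for data the I/O layer has not yet accepted.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;

// Hypothetical illustration of a blocking (synchronous) write loop over one direct buffer.
class SyncDirectBufferWrite {

    // Writes 'payload' chunk by chunk, waiting for each chunk before preparing the next.
    // Returns the total number of bytes written.
    static int writeSynchronously(byte[] payload, WritableByteChannel out, int bufferSize) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(bufferSize); // native memory, not Java heap
        int written = 0;
        int offset = 0;
        try {
            while (offset < payload.length) {
                int chunk = Math.min(bufferSize, payload.length - offset);
                buffer.clear();
                buffer.put(payload, offset, chunk);
                buffer.flip();
                while (buffer.hasRemaining()) {
                    written += out.write(buffer); // the calling thread waits here
                }
                offset += chunk;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return written;
    }

    public static void main(String[] args) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        int n = writeSynchronously(new byte[10_000], Channels.newChannel(sink), 4096);
        System.out.println(n + " bytes written using one 4096-byte direct buffer");
    }
}
```

An asynchronous writer would return before the chunk is accepted, so data produced in the meantime has to be held somewhere, which is where the additional DBB native memory overhead described above can come from.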

Anatomy of workspace/wstemp in WebSphere Application Server .

What is Workspace:
Whenever a user logs into the administrative console, or uses wsadmin scripting to make a configuration change, the changes are stored in the workspace. When a user uses the ConfigService configuration service interface of the Java application programming interfaces (APIs), the user specifies a session object that is associated with the workspace in order to store the changes. Only when the user performs a save operation under the administrative console, wsadmin scripting, or the Java APIs are the changes propagated and merged with the master configuration repository.

For each administrative console user or each invocation of wsadmin scripting, the application server creates a separate workspace directory to store the intermediate changes until the changes are merged with the master configuration repository. Users of the Java APIs use different session objects to decide where the workspace directory resides. Both the administrative console and wsadmin scripting generate user IDs randomly. The user IDs are different from the user IDs that you use to log into the administrative console or wsadmin scripting. The Java APIs can either randomly generate the user ID or specify the user ID as an option when creating the session object.

Workspace keeps track of context and file states whenever a caller does an add / delete / extract / update using Workspace APIs.

Default wstemp directory location:  %user.install.root%\wstemp\
where %user.install.root% is %WAS_HOME%\profiles\<profile name>\

This location can be changed by setting the following JVM custom property:
              UNIX platform:
                           -Dwebsphere.workspace.root=/temp
              Windows platform:
                           -Dwebsphere.workspace.root=c:\temp

Set the JVM custom property through the administrative console by adding it as a name-value pair on the Custom properties page.
For example,
    In the console, click System Administration > Deployment manager > Java and Process Management > Process definition > Java Virtual Machine > Custom properties.

Workspace directory location
%user.install.root%\wstemp\<user id>\workspace\
%websphere.workspace.root%\wstemp\<user id>\workspace\
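
The two locations above can be expressed as a small lookup. This is only a sketch of the layout described in this post, not WebSphere's own resolution code. The class and method names are hypothetical, and it assumes that websphere.workspace.root, when set, takes precedence over user.install.root.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical helper that builds <root>/wstemp/<user id>/workspace from JVM properties.
class WorkspaceLocation {

    static Path workspaceDir(Properties props, String userId) {
        // Assumption: the custom property overrides the profile default when present.
        String root = props.getProperty("websphere.workspace.root");
        if (root == null || root.isEmpty()) {
            root = props.getProperty("user.install.root");
        }
        if (root == null || root.isEmpty()) {
            throw new IllegalStateException("neither websphere.workspace.root nor user.install.root is set");
        }
        return Paths.get(root, "wstemp", userId, "workspace");
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("user.install.root", "/opt/IBM/WebSphere/AppServer/profiles/Dmgr01"); // example path only
        System.out.println(workspaceDir(props, "anonymous1399582815240"));
    }
}
```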


Workspace generates the following files to keep track of the status of workspace, context and file:
WorkSpace State
  •     ".workspace_<session id>"
                     Located under the %user.install.root%\wstemp\<user id>\workspace directory; it identifies which session of the user is working on the workspace.
                     This file contains runtime call stack information from WorkSpaceManagerImpl.createWorkSpace(Properties).
Context State
  •     ".repositoryContext"
                     Located under each <context type>\<context name> subdirectory of the workspace; it keeps the status of the files being accessed.

File State
  •     <File Name>".copy": keeps the digest of the file being extracted.
  •     <File Name>".current": keeps the original copy of the file if a merge is needed for this document. Only serverIndex.xml will be merged at this time.


Directory types generated under wstemp folder
  •     Script*: created by a wsadmin script;  
  •     anonymous*: created by a client that passes a null workspace/session id.  
  •     -NNNNNNNNN: created by Admin Console  
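
When auditing a large wstemp folder, the naming rules above can be used to tell which kind of caller left each directory behind. The following Java sketch applies only the three patterns listed in this post. The class name, method name and the sample "Script" directory name are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical classifier for directory names found directly under wstemp.
class WstempDirClassifier {

    // "-NNNNNNNNN": a minus sign followed by digits only.
    private static final Pattern CONSOLE = Pattern.compile("-\\d+");

    static String creatorOf(String dirName) {
        if (dirName.startsWith("Script")) return "wsadmin script";
        if (dirName.startsWith("anonymous")) return "client with null workspace/session id";
        if (CONSOLE.matcher(dirName).matches()) return "Admin Console";
        return "unknown";
    }

    public static void main(String[] args) {
        String[] samples = {"Script14a2b3c", "anonymous1399582815240", "-123456789"};
        for (String name : samples) {
            System.out.println(name + " -> " + creatorOf(name));
        }
    }
}
```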


It is the responsibility of the caller to remove the workspace session. It is better that the caller creates a session and then discards the session when finished. Each workspace is associated with a session object.

1. Session()
This one always creates a new unique session. The id of the session will be anonymous+current-time-in-millisecs. The following two lines will result in creation of a new unique workspace:
Session session = new Session();
configService.resolve(session, "Cell=");

So, every time a new session object is created this way and used, a new unique workspace directory will be created. Therefore, these workspaces should be removed by calling configService.discard(session) method or workspace.remove() method.

2. Session(id, reuse)
This one creates a new session based on the supplied id. If there is already a workspace created for that id, the same workspace will be used. This does not hold if the reuse flag is set to false: in that case, a new session with id+current-time-in-millisecs is created.
  • The following two lines will result in creation of a new workspace if the corresponding workspace directory does not exist:
                    Session session = new Session("myWorkspace", true);
                    configService.resolve(session, "Cell=");  
    
                    The workspace must be removed by calling workspace.remove() or configService.discard(session).  
  • The following two lines will result in creation of a new unique workspace:
                    Session session = new Session("myWorkspace", false);
                    configService.resolve(session, "Cell=");

Note: Every time a session object is created this way, it results in a new unique workspace. Therefore, all the workspaces created this way should be removed. If the caller code is not discarding the session it created, then the size of the wstemp folder will grow and eventually it can cause an OOM (out of memory) condition of dmgr or server depending on your setup.
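
One way for caller code to make sure the session is always discarded is a try/finally block. The sketch below uses simplified stand-in types so that it runs without the WebSphere admin JARs on the classpath: the nested Session class and ConfigService interface here are hypothetical reductions of the real ones (the real resolve() returns results and declares checked exceptions), and RecordingConfigService exists only to demonstrate the call order.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: discard the session in 'finally' so its workspace is not left in wstemp.
class DiscardSessionSketch {

    // Simplified stand-in for the WebSphere Session class; only the id matters here.
    static final class Session {
        final String id;
        Session(String id) { this.id = id; }
    }

    // Simplified stand-in for the two ConfigService calls mentioned in the text.
    interface ConfigService {
        void resolve(Session session, String query);
        void discard(Session session);
    }

    // Test double that records calls and can simulate a failing resolve().
    static final class RecordingConfigService implements ConfigService {
        final List<String> calls = new ArrayList<>();
        private final boolean failOnResolve;
        RecordingConfigService(boolean failOnResolve) { this.failOnResolve = failOnResolve; }

        @Override public void resolve(Session session, String query) {
            calls.add("resolve " + session.id);
            if (failOnResolve) throw new IllegalStateException("simulated failure");
        }

        @Override public void discard(Session session) {
            calls.add("discard " + session.id);
        }
    }

    // Runs the query and discards the session whether or not the query succeeds.
    static void resolveAndDiscard(ConfigService configService, Session session, String query) {
        try {
            configService.resolve(session, query);
        } finally {
            configService.discard(session);
        }
    }

    public static void main(String[] args) {
        RecordingConfigService service = new RecordingConfigService(false);
        resolveAndDiscard(service, new Session("myWorkspace"), "Cell=");
        System.out.println(service.calls);
    }
}
```

Putting the discard in the finally block matters because an exception thrown between creating the session and discarding it is exactly the kind of path that leaves orphaned workspace directories behind.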

To find the caller that creates the workspace, one can turn on the following trace string on either the deployment manager or the base application server, depending on the type of setup:

Current trace specification = *=info:com.ibm.ws.sm.workspace.impl.WorkSpaceManagerImpl=all

With the above trace string enabled, one can see the following logging in the trace.log file:

In the example, I used the case "anonymous*: created by a client that passes a null workspace/session id"

[5/9/14 9:00:15:241 NZST] 00000001 WorkSpaceMana >  getWorkSpace, UserID: anonymous1399582815240, SessionID: null, create workspace if not found: true Entry
[5/9/14 9:00:15:241 NZST] 00000001 WorkSpaceMana 3   Call stack info of createWorkSpace(prop), workspace id anonymous1399582815240:        
<----- new workspace anonymous1399582815240 is created ----->
    com.ibm.ws.sm.workspace.impl.WorkSpaceManagerImpl.getCallStack(WorkSpaceManagerImpl.java:591)
    com.ibm.ws.sm.workspace.impl.WorkSpaceManagerImpl.createWorkSpace(WorkSpaceManagerImpl.java:148)
    com.ibm.ws.sm.workspace.impl.WorkSpaceManagerImpl.getWorkSpace(WorkSpaceManagerImpl.java:304)
    com.ibm.ws.sm.workspace.impl.WorkSpaceManagerImpl.getWorkSpace(WorkSpaceManagerImpl.java:241)
    com.ibm.ws.sm.workspace.impl.WorkSpaceManagerImpl.getWorkSpace(WorkSpaceManagerImpl.java:230)
    com.ibm.ws.management.configservice.WorkspaceHelper.getWorkspace(WorkspaceHelper.java:100)
    com.ibm.ws.management.configservice.WorkspaceHelper.getScopeContexts(WorkspaceHelper.java:344)
    com.ibm.ws.management.configservice.RootObjectDelegator.getAll(RootObjectDelegator.java:118)
    com.ibm.ws.management.configservice.ConfigServiceImpl.queryConfigObjects(ConfigServiceImpl.java:948)
    com.ibm.ws.management.configservice.ConfigServiceImpl.resolve(ConfigServiceImpl.java:1083)
    com.ibm.ws.management.configservice.ConfigServiceImpl.resolve(ConfigServiceImpl.java:1043)
    com.ibm.bpmcommon.upgrade.database.ValidateDatabaseVersion.getCluster(ValidateDatabaseVersion.java:226)
    com.ibm.bpmcommon.upgrade.database.ValidateDatabaseVersion.verifyStandardDB4ND(ValidateDatabaseVersion.java:199)
    com.ibm.bpmcommon.upgrade.database.ValidateDatabaseVersion.start(ValidateDatabaseVersion.java:120)
    com.ibm.ws.runtime.component.ContainerHelper.startComponents(ContainerHelper.java:539)
    com.ibm.ws.runtime.component.ContainerImpl.startComponents(ContainerImpl.java:627)
    com.ibm.ws.runtime.component.ContainerImpl.start(ContainerImpl.java:618)
    com.ibm.ws.runtime.component.ApplicationServerImpl.start(ApplicationServerImpl.java:252)
    com.ibm.ws.runtime.component.ContainerHelper.startComponents(ContainerHelper.java:539)
    com.ibm.ws.runtime.component.ContainerImpl.startComponents(ContainerImpl.java:627)
    com.ibm.ws.runtime.component.ContainerImpl.start(ContainerImpl.java:618)
    com.ibm.ws.runtime.component.ServerImpl.start(ServerImpl.java:523)
    com.ibm.ws.runtime.WsServerImpl.bootServerContainer(WsServerImpl.java:310)
    com.ibm.ws.runtime.WsServerImpl.start(WsServerImpl.java:223)
    com.ibm.ws.runtime.WsServerImpl.main(WsServerImpl.java:686)
    com.ibm.ws.runtime.WsServer.main(WsServer.java:59)
    sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
    sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
    java.lang.reflect.Method.invoke(Method.java:611)
    com.ibm.wsspi.bootstrap.WSLauncher.launchMain(WSLauncher.java:234)
    com.ibm.wsspi.bootstrap.WSLauncher.main(WSLauncher.java:96)
    com.ibm.wsspi.bootstrap.WSLauncher.run(WSLauncher.java:77)
    sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
    sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
    java.lang.reflect.Method.invoke(Method.java:611)
    org.eclipse.equinox.internal.app.EclipseAppContainer.callMethodWithException(EclipseAppContainer.java:587)
    org.eclipse.equinox.internal.app.EclipseAppHandle.run(EclipseAppHandle.java:198)
    org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.runApplication(EclipseAppLauncher.java:110)
    org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:79)
    org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:369)
    org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:179)
    sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
    sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
    java.lang.reflect.Method.invoke(Method.java:611)
    org.eclipse.core.launcher.Main.invokeFramework(Main.java:340)
    org.eclipse.core.launcher.Main.basicRun(Main.java:282)
    org.eclipse.core.launcher.Main.run(Main.java:981)
    com.ibm.wsspi.bootstrap.WSPreLauncher.launchEclipse(WSPreLauncher.java:380)
    com.ibm.wsspi.bootstrap.WSPreLauncher.main(WSPreLauncher.java:151)
[5/9/14 9:00:15:254 NZST] 00000001 WorkSpaceMana 3   Create workspace [WorkSpaceManagerImpl.createWorkSpace(prop)]...
     WorkspaceUserPath: /u01/app/dev/d4/bpm/node1/wstemp/anonymous1399582815240
     WorkspacePath ...: /u01/app/dev/d4/bpm/node1/wstemp/anonymous1399582815240/workspace
     repositoryAdapter: com.ibm.ws.sm.workspace.impl.WorkSpaceMasterRepositoryAdapter
[5/9/14 9:00:15:254 NZST] 00000001 WorkSpaceMana 3   getClassOfType, className: com.ibm.ws.sm.workspace.impl.WorkSpaceMasterRepositoryAdapter
[5/9/14 9:00:15:255 NZST] 00000001 WorkSpaceMana 3   profileKey
                                 <null>
[5/9/14 9:00:15:258 NZST] 00000001 WorkSpaceMana A   WKSP0500I: Workspace configuration consistency check is disabled.
[5/9/14 9:00:15:287 NZST] 00000001 WorkSpaceMana <  getWorkSpace, WS: com.ibm.ws.sm.workspace.impl.WorkSpaceImpl@e332d7a3 Exit
[5/9/14 9:00:15:299 NZST] 00000001 WorkSpaceMana >  getWorkSpace, UserID: anonymous1399582815240, SessionID: null, create workspace if not found: true Entry
[5/9/14 9:00:15:299 NZST] 00000001 WorkSpaceMana 3   The same sessionId share the same WorkSpace.
[5/9/14 9:00:15:300 NZST] 00000001 WorkSpaceMana <  getWorkSpace, WS: com.ibm.ws.sm.workspace.impl.WorkSpaceImpl@e332d7a3 Exit
[5/9/14 9:00:15:328 NZST] 00000001 WorkSpaceMana >  getWorkSpace, UserID: anonymous1399582815240, SessionID: null, create workspace if not found: true Entry
[5/9/14 9:00:15:328 NZST] 00000001 WorkSpaceMana 3   The same sessionId share the same WorkSpace.
[5/9/14 9:00:15:329 NZST] 00000001 WorkSpaceMana <  getWorkSpace, WS: com.ibm.ws.sm.workspace.impl.WorkSpaceImpl@e332d7a3 Exit
[5/9/14 9:00:15:330 NZST] 00000001 WorkSpaceMana >  getWorkSpace, UserID: anonymous1399582815240, SessionID: null, create workspace if not found: true Entry
[5/9/14 9:00:15:330 NZST] 00000001 WorkSpaceMana 3   The same sessionId share the same WorkSpace.
[5/9/14 9:00:15:330 NZST] 00000001 WorkSpaceMana <  getWorkSpace, WS: com.ibm.ws.sm.workspace.impl.WorkSpaceImpl@e332d7a3 Exit
[5/9/14 9:00:15:451 NZST] 00000001 WorkSpaceMana >  getWorkSpace, UserID: anonymous1399582815240, SessionID: null, create workspace if not found: true Entry
[5/9/14 9:00:15:458 NZST] 00000001 WorkSpaceMana 3   The same sessionId share the same WorkSpace.
[5/9/14 9:00:15:459 NZST] 00000001 WorkSpaceMana <  getWorkSpace, WS: com.ibm.ws.sm.workspace.impl.WorkSpaceImpl@e332d7a3 Exit
[5/9/14 9:00:15:472 NZST] 00000001 WorkSpaceMana 3   removeWorkSpace, UserID: anonymous1399582815240, SessionID: nullCall stack info of removeWorkSpace(string, string): 
<----- discard() method is being called properly hence anonymous1399582815240 is removed at the end of session ----->
   com.ibm.ws.sm.workspace.impl.WorkSpaceManagerImpl.getCallStack(WorkSpaceManagerImpl.java:591)
    com.ibm.ws.sm.workspace.impl.WorkSpaceManagerImpl.removeWorkSpace(WorkSpaceManagerImpl.java:452)
    com.ibm.ws.sm.workspace.impl.WorkSpaceManagerImpl.removeWorkSpace(WorkSpaceManagerImpl.java:446)
    com.ibm.ws.management.configservice.ConfigServiceImpl.discard(ConfigServiceImpl.java:789)
    com.ibm.bpmcommon.upgrade.database.ValidateDatabaseVersion.getCluster(ValidateDatabaseVersion.java:242)
    com.ibm.bpmcommon.upgrade.database.ValidateDatabaseVersion.verifyStandardDB4ND(ValidateDatabaseVersion.java:199)
    com.ibm.bpmcommon.upgrade.database.ValidateDatabaseVersion.start(ValidateDatabaseVersion.java:120)
    com.ibm.ws.runtime.component.ContainerHelper.startComponents(ContainerHelper.java:539)
    com.ibm.ws.runtime.component.ContainerImpl.startComponents(ContainerImpl.java:627)
    com.ibm.ws.runtime.component.ContainerImpl.start(ContainerImpl.java:618)
    com.ibm.ws.runtime.component.ApplicationServerImpl.start(ApplicationServerImpl.java:252)
    com.ibm.ws.runtime.component.ContainerHelper.startComponents(ContainerHelper.java:539)
    com.ibm.ws.runtime.component.ContainerImpl.startComponents(ContainerImpl.java:627)
    com.ibm.ws.runtime.component.ContainerImpl.start(ContainerImpl.java:618)
    com.ibm.ws.runtime.component.ServerImpl.start(ServerImpl.java:523)
    com.ibm.ws.runtime.WsServerImpl.bootServerContainer(WsServerImpl.java:310)
    com.ibm.ws.runtime.WsServerImpl.start(WsServerImpl.java:223)
    com.ibm.ws.runtime.WsServerImpl.main(WsServerImpl.java:686)
    com.ibm.ws.runtime.WsServer.main(WsServer.java:59)
    sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
    sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
    java.lang.reflect.Method.invoke(Method.java:611)

If the discard() method were not being called, the entry "WorkSpaceMana 3   removeWorkSpace," would not appear in the trace.log file for workspace UserID anonymous1399582815240.
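
Checking a long trace.log by eye for missing removeWorkSpace entries is tedious, so the comparison can be scripted. The following is a rough sketch, not an IBM tool: the class name and the two regular expressions are assumptions derived only from the sample trace lines shown above, and other product versions may format these entries differently.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class WorkspaceLeakScan {
    // Patterns follow the sample trace above; they are assumptions, not a documented format.
    private static final Pattern CREATED =
            Pattern.compile("createWorkSpace\\(prop\\), workspace id ([^\\s:]+)");
    private static final Pattern REMOVED =
            Pattern.compile("removeWorkSpace, UserID: ([^\\s,]+)");

    /** Returns workspace IDs that were created but never removed, in creation order. */
    static Set<String> findUnremoved(List<String> traceLines) {
        Set<String> open = new LinkedHashSet<>();
        for (String line : traceLines) {
            Matcher created = CREATED.matcher(line);
            if (created.find()) {
                open.add(created.group(1));
            }
            Matcher removed = REMOVED.matcher(line);
            if (removed.find()) {
                open.remove(removed.group(1));
            }
        }
        return open;
    }
}
```

The lines could be read with Files.readAllLines on the trace.log path. Any ID left in the result points at a caller whose createWorkSpace call stack should be reviewed for a missing discard().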

Note: If discard() is never called, the workspace session does not get cleaned up. If the code makes a lot of ConfigService API calls, the /profile_home/wstemp directory keeps growing, which will eventually cause an OOM (Out Of Memory) condition on the JVM (deployment manager or appserver). As a workaround, stop the JVM and clean up the contents of the "/profile_home/wstemp" directory. For a permanent solution, review the code to make sure that a discard method is called to clear the workspace.

Source: https://www.ibm.com/developerworks/community/blogs/aimsupport/entry/anatomy_of_workspace_wstemp_websphere_application_server?lang=en

MustGather: Performance, hang, or high CPU issues with WebSphere Application Server on Linux

If you are experiencing performance, hang, or high CPU issues with WebSphere Application Server on Linux, this MustGather will assist you in collecting the data necessary to diagnose and resolve the issue. There are two scripts that can be used to collect the performance diagnostic information. Download one of the scripts and use it to collect information while the problem is occurring.

Collecting data


Complete the following three steps:

(1) Collecting the required data:

If you have not already done so, enable verboseGC and restart the problematic server(s).

At the time of the problem, run the attached script with the following command:

./linperf.sh [PID]


This script will create a file named linperf_RESULTS.tar.gz and three javacores. This script should be executed as the root user. As with any script, you may need to add execute permissions before executing the script (chmod).

In the above command, [PID] is the Process ID of the problematic JVM(s). If specifying multiple Process IDs they should each be separated by a space.

(2) Collecting log files:

Collect the server logs (SystemOut.log, native_stderr.log,...) from the problematic server(s):

profile_root/logs/server_name/*

(3) Submitting required data:

Zip/Tar all the files gathered:
  • linperf_RESULTS.tar.gz
  • javacores
  • server logs (SystemOut.log, native_stderr.log,...)

Send the results to IBM Support: "Exchanging information with IBM Support"




Frequently Asked Questions:

    What is the impact of enabling verboseGC?
    VerboseGC data is critical to diagnosing these issues. This can be enabled on production systems because it has a negligible impact on performance (< 2%).

    What is the linperf_RESULTS.tar.gz file and where can I find it?
    The linperf_RESULTS.tar.gz file is created while running the linperf.sh script and contains output from the commands called by the script. It will be created in the directory from which you execute the script.

    What are 'javacores' and where do I find them?
    Javacores are snapshots of the JVM activity and are essential to troubleshooting these issues. These files will usually be found in the <profile_root>. If you don't find the files here, you can search your entire system for them using the following command:

    find / -name "*javacore*"


If asked to do so:
The preceding data is used to troubleshoot most issues of this type; however, in certain situations Support may require additional data. Only collect the following data if asked to do so by IBM Support.
    A series of system cores
      Collect a series of system cores by running the following commands:
      Note: These commands require the gdb debugger to be installed.

      gdb install_root/java/jre/bin/java [PID]
      (gdb) generate core1.[PID]
      (gdb) detach
      (gdb) quit

      <wait two minutes>

      gdb install_root/java/jre/bin/java [PID]
      (gdb) generate core2.[PID]
      (gdb) detach
      (gdb) quit

      <wait two minutes>

      gdb install_root/java/jre/bin/java [PID]
      (gdb) generate core3.[PID]
      (gdb) detach
      (gdb) quit

      Creating core files with gdb as described above should not kill the process. If the above 'generate' command does not work then try using the 'gcore' command.

      Process the resulting system core files (core1.[PID], core2.[PID],...) using the instructions in How to process a core dump using jextract on the IBM SDK on Windows, Linux, and AIX

    System core
      In cases where a series of system cores cannot be collected, a single core file can be collected using the following method. Note, however, that this kills the process, and that a series of system cores has much greater diagnostic value than a single core for this type of issue. The single core file must be processed with jextract as noted in the above section.

      kill -6 [PID]

    Monitor process sizes and paging usage
      The linmon.sh script will collect data every 5 minutes until it is stopped manually. Run the following command before the issue occurs to start the script:

      ./linmon.sh

      This will create two files: ps_mon.out and vmstat_mon.out.

Exchanging data with IBM Support

To diagnose or identify a problem, it is sometimes necessary to provide Technical Support with data and information from your system. In addition, Technical Support might also need to provide you with tools or utilities to be used in problem determination. Submitting the collected files to Technical Support helps speed problem diagnosis.

HMGR0152W: CPU Starvation detected messages in SystemOut.log

Problem (Abstract)

New system is working properly but HMGR warning messages are being logged in the SystemOut.log file.

Symptom

[10/25/05 16:42:27:635 EDT] 0000047a CoordinatorCo W HMGR0152W: CPU Starvation detected. Current thread scheduling delay is 9 seconds.

Cause

The HMGR0152W message is an indication that JVM thread scheduling delays are occurring for this process.

The WebSphere® Application Server high availability manager component contains thread scheduling delay detection logic that periodically schedules a thread to run and tracks whether the thread was dispatched and run as scheduled. By default, a delay detection thread is scheduled to run every 30 seconds, and an HMGR0152W message is logged if it does not run within 5 seconds of the expected schedule. The message indicates the delay time, that is, the time differential between when the thread was expected to get the CPU and when the thread actually got CPU cycles.
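
The detection idea can be illustrated with a little arithmetic on timestamps. The following is a simplified sketch under stated assumptions, not the high availability manager's actual code: the class and method names are hypothetical, and treating a delay of exactly the tolerated value as acceptable is a guess based on the wording "within 5 seconds".

```java
final class SchedulingDelayCheck {
    /** Whole seconds between the expected and the actual dispatch time (never negative). */
    static long delaySeconds(long expectedMillis, long actualMillis) {
        return Math.max(0L, actualMillis - expectedMillis) / 1000L;
    }

    /** True when the thread ran later than the tolerated delay (5 seconds by default in the text). */
    static boolean shouldWarn(long expectedMillis, long actualMillis, long toleratedSeconds) {
        return actualMillis - expectedMillis > toleratedSeconds * 1000L;
    }

    public static void main(String[] args) {
        long expected = 0L;    // time at which the detection thread was due to run
        long actual = 9_000L;  // time at which it actually ran
        if (shouldWarn(expected, actual, 5L)) {
            System.out.println("scheduling delay of " + delaySeconds(expected, actual) + " seconds");
        }
    }
}
```

With these sample values the thread ran 9 seconds late, which exceeds the 5 second tolerance, matching the delay reported in the Symptom message above.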

The HMGR0152W message can occur even when plenty of CPU resource is available. There are a number of reasons why the scheduled thread might not have been able to get the CPU in a timely fashion. Some common causes include the following:
  • The physical memory is overcommitted and paging is occurring.
  • The heap size for the process is too small causing garbage collection to run too frequently and/or too long, blocking execution of other threads.
  • There might simply be too many threads running in the system, and too much load placed on the machine, which might be indicated by high CPU utilization.


Resolving the problem

The HMGR0152W message is attempting to warn you that a condition is occurring that might lead to instability if it is not corrected. Analysis should be performed to understand why the thread scheduling delays are occurring, and what action(s) should be taken. Some common solutions include the following:
  • Adding more physical memory to prevent paging.
  • Tuning the JVM memory (heap size) for optimal garbage collection.
  • Reducing the overall system load to an acceptable value.

If the HMGR0152W messages do not occur very often, and indicate that the thread scheduling delay is relatively short (for example, < 20 seconds), it is likely that no other errors will occur and the message can safely be ignored.

The high availability manager thread scheduling delay detection is configurable by setting either of the following 2 custom properties.
  • IBM_CS_THREAD_SCHED_DETECT_PERIOD determines how often a delay detection thread is scheduled to run. The default value of this parameter is 30 (seconds).
  • IBM_CS_THREAD_SCHED_DETECT_ERROR determines how long of a delay should be tolerated before a warning message is logged. By default this value is 5 (seconds).

These properties are scoped to a core group and can be configured as follows:
  1. In the administrative console, click Servers > Core groups > Core groups settings and then select the core group name.

  2. Under Additional Properties, click Custom properties > New.

  3. Enter the property name and desired value.

  4. Save the changes.

  5. Restart the server for these changes to take effect.

While it is possible to use the custom properties mentioned above to increase the thread-scheduling-detect-period until the HMGR0152W warning messages no longer occur, this is not recommended. The proper solution is to tune the system to eliminate the thread scheduling delays.

Related information

MustGather: High CPU issues
Tuning operating systems
Tuning the Application Server Environment

CPU is starved: How to feed my CPU.

Scarlett O'Hara once said "I'm going to live through this and when it's all over, I'll never be hungry again."  That was a story and era before computers. These days our computers can become starved, at least the Java (tm) virtual machine (JVM) can.  Performance is a key concern for everyone. When users have to wait, they are discouraged and either become distracted or they go somewhere else. Keeping a system running smoothly is key. Every now and then systems will have a slow spot. However, when it continuously impacts users, this issue must be investigated. For this article, I am focusing on the following example outputs seen in a JVM SystemOut.log.  These examples came from an IBM Business Process Manager SystemOut.log file.

HMGR0152W: CPU Starvation detected. Current thread scheduling delay is 23 seconds.
DCSV0004W: DCS Stack DefaultCoreGroup at Member PCCell01\PCNode01\BPM751PDEV.AppTarget.PCMNode01.0: Did not receive adequate CPU time slice. Last known CPU usage time at 12:23:55:452 CST. Inactivity duration was 31 seconds.

What does CPU Starvation mean?
CPU Starvation means that the JVM had to wait for processing time! Some other process took 100% of the CPU and the JVM did not work. Twenty-three seconds is a long time for a server to wait. In some examples, I have seen the wait time as high as 70 seconds.

Where is all the CPU time going?
There are two places to look. One is on the system itself. Is there a process on the operating system that has run away and is running at 99%?  A simple top command combined with the kill -3 command on Linux operating systems, or the Task Manager on Windows operating systems, can help. If there is another application on the server that became hung and is taking all the CPU time, investigate and stop the process.

If the operating system does not have any extra processes and you see CPU starvation, most likely the server is a guest operating system on a virtual environment. What this means is the larger virtual infrastructure does not have enough CPU time to give all of the virtual machines it controls. Contact your virtual machine provider or internal sysops team to start investigating the overall health of the virtual system. Other virtual machines in the environment might be using the system heavily and need to move to a different server.  Another option would be to dedicate CPU usage rather than sharing, which is default.  We have a document that offers links to other documents to consider when you are running J2EE applications and databases in a virtual environment.


Source: https://www.ibm.com/developerworks/community/blogs/WebSphere_Process_Server/entry/hungry_cpu?lang=en

How to diagnose error "SRVE0255E: A WebGroup/Virtual Host to handle {0} has not been defined"

Error SRVE0255E means that the webcontainer could not find a web group (web module) or virtual host to handle the request.
Here are the steps to diagnose this error:
1. Make sure that the URL entered at the browser is correct. Particularly, make sure that the context root from the URL matches the context root configured for the application.
2. Review the SystemOut.log to make sure that the application and server are started successfully and without any errors.
3. Verify that the application web module is mapped to the correct/intended virtual host. You can do this from the admin console by navigating to the following path:
Applications > WebSphere enterprise applications > [app_name] > Virtual hosts

4. Under the virtual host that the application is mapped to (#3), make sure that there is a host alias definition for the host name and port number that this request is sent to. You can do this from the admin console by navigating to the following path:
Environment > Virtual hosts > [virtual_host_name] > Host aliases

5. Check the host alias definitions for other virtual hosts on this same server and make sure that there is no duplicate host alias definition with the same host name and port number. For example, if you have a host alias definition for host name www.example.com and port number 9080 under the virtual host default_host, you must NOT have a duplicate host alias definition, something like hostname * and port number 9080, under another virtual host such as custom_host.
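
The duplicate check in step 5 can be expressed as a small comparison. The following is an illustrative sketch only: the HostAlias class is hypothetical and is not a WebSphere API, and it assumes that ports are plain numbers and that "*" is the only wildcard host name.

```java
import java.util.ArrayList;
import java.util.List;

final class HostAlias {
    final String virtualHost;
    final String hostName; // "*" is treated as matching any host name
    final int port;

    HostAlias(String virtualHost, String hostName, int port) {
        this.virtualHost = virtualHost;
        this.hostName = hostName;
        this.port = port;
    }

    /** True when both aliases could match the same request but sit under different virtual hosts. */
    boolean conflictsWith(HostAlias other) {
        boolean hostsOverlap = hostName.equals("*") || other.hostName.equals("*")
                || hostName.equalsIgnoreCase(other.hostName);
        return port == other.port && hostsOverlap && !virtualHost.equals(other.virtualHost);
    }

    /** Lists every conflicting pair once, as "a <-> b". */
    static List<String> findConflicts(List<HostAlias> aliases) {
        List<String> conflicts = new ArrayList<>();
        for (int i = 0; i < aliases.size(); i++) {
            for (int j = i + 1; j < aliases.size(); j++) {
                if (aliases.get(i).conflictsWith(aliases.get(j))) {
                    conflicts.add(aliases.get(i) + " <-> " + aliases.get(j));
                }
            }
        }
        return conflicts;
    }

    @Override
    public String toString() {
        return virtualHost + ":" + hostName + ":" + port;
    }
}
```

For the example in step 5, www.example.com:9080 under default_host and *:9080 under custom_host are reported as one conflicting pair.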

How nodeagent monitors WebSphere Application Server.

This document explains how the monitoring policy works in WebSphere Application Server (WAS) and what is the recommended way to start the application server in parallel.
There are multiple ways to monitor the application server. For example, using JMX programming, using 3rd party tools, and using other sources. This document explains only how to use nodeagent monitoring for application server and monitoring servers using the Windows service.

1. When I reboot my machine, I want to start all the servers (including Dmgr and nodeagent) automatically. What is the recommended way to do that?
Create a wasservice for the Deployment Manager server and for the nodeagent using WASServiceCmd and set each to automatic. Do not create one for the application server.
Set the monitoring policy of the application server to Running.
The nodeagent can monitor the application server process. It can start the server during nodeagent startup if the server is down, restart a hung server, or start the server when it goes down abnormally.
Note: Never create a wasservice for the application server and set it to automatic. During a machine reboot the application server might try to start before the nodeagent starts. This can cause the server to fail to register with the nodeagent LSD and fail to start. This is the only reason why we don't recommend having the server started automatically by an operating system service such as WASService. Review the dW Answer item "What is the recommended way to start WebSphere Application Server, Dmgr and nodeagent automatically?" for more information.

2. How can we start the application servers in parallel? In other words, can I start all application servers at the same time (not in sequence)?
Yes, you can do it using the com.ibm.websphere.management.nodeagent.bootstrap.maxthreadpool custom property.
Set the property under System Administration > Node agent > nodeagent_name > Java and process management > Process definition > Java virtual machine > Custom properties.
Use this property to control the number of threads that can be included in a newly created thread pool. A dedicated thread is created to start each application server Java virtual machine (JVM). The JVMs with dedicated threads in this thread pool are the JVMs that are started in parallel whenever the node agent starts.
You can specify an integer from 0 - 5 as the value for this property. If the value you specify is greater than 0, a thread pool is created with that value as the maximum number of threads that can be included in this newly created thread pool. The following table lists the supported values for this custom property and their effect.

Property threadpool.maxsize is set to 0 or not specified - The node agent starts up to five JVMs in parallel.
Property threadpool.maxsize is set to 1                  - The node agent starts the JVMs serially.
Property threadpool.maxsize value between 2 and 5        - The node agent starts a number of JVMs equal to the specified value in parallel.
Note: With this property you can only start a maximum of 5 servers at a time.
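
The mapping from the property value to the number of JVMs started in parallel can be summarized in a few lines. This is only a sketch of the table above, not node agent code: the class and method names are hypothetical, and rejecting values outside 0 - 5 with an exception is an assumption, since the text does not say how such values are handled.

```java
final class NodeAgentParallelStart {
    static final int MAX_PARALLEL = 5;

    /** Maps the custom property value (null when not specified) to the number of parallel JVM starts. */
    static int parallelStarts(Integer propertyValue) {
        if (propertyValue == null || propertyValue == 0) {
            return MAX_PARALLEL; // not specified or 0: up to five JVMs in parallel
        }
        if (propertyValue < 0 || propertyValue > MAX_PARALLEL) {
            // Assumption: out-of-range values are treated as a configuration error.
            throw new IllegalArgumentException("value must be between 0 and " + MAX_PARALLEL);
        }
        return propertyValue; // 1 means serial startup, 2 to 5 means that many in parallel
    }
}
```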

3. Why does logging off of Windows system crash WebSphere Application Server?
You should never log off the system when a Windows service has not been created for the process. The crash leaves no footprint; it is almost like killing the java process from the Task Manager. When you log off, the server process is killed, whether it is the Dmgr, nodeagent, or application server process. You must create a Windows service for the server process if you want the process to keep running after you log off the Windows user.
To create windows service using wasservicehelper, see the topic "Using the WASServiceHelper utility to create Windows services for application servers" in the product documentation.
Notes:
  1. If you want the process to keep running (without the Windows service), lock the computer; don't log off.
  2. If you restart the system, the process will be killed irrespective of the Windows service.
  3. It's recommended not to create a Windows service for the application servers. The nodeagent should monitor the application server using the monitoring policy. For more information, please refer to the product documentation.

4. How does nodeagent monitor the application server and how does it know the previous state of the application server?
When the nodeagent monitors the application server (with the monitoring policy created as mentioned in question 1), it saves the server state information in the monitored.state file. The file holds the previous server state and the application server PID. In case of an application server crash or hang, the nodeagent gets the previous state of the server from the monitored.state file and then tries to start the application server automatically.
Note: If you notice StringIndexOutOfBoundsException or any other exception in the NodeAgent.loadNodeState stack (nodeagent SystemOut.log file), it means the monitored.state file is corrupted. You must stop all servers, delete the file and then start the nodeagent again. For example:

    Caused by: java.lang.StringIndexOutOfBoundsException
    at java.lang.String.substring(String.java:1115)
    at com.ibm.ws.management.nodeagent.NodeAgent.loadNodeState(NodeAgent.java:3210)

5. My application servers were monitored by the nodeagent. When the server was hung, why didn't the nodeagent restart the server?
Before I answer this question, please review the product documentation section "Monitoring policy settings" that explains how the monitoring policy works in WAS.
The nodeagent PidWaiter sends a signal at every ping timeout interval to get the status of the application server. If the PidWaiter does not get a response back from the application server, the server is considered hung. Once the application server is identified as unresponsive/hung, the nodeagent PidWaiter sends a SIGTERM to the process, which does not guarantee that the process stops immediately. It sends the signal and waits for the process to shut down normally. If the server doesn't respond to any request, the server just stays hung forever.
If you want the server to be killed when it's hung or doesn't respond to the nodeagent ping, set the "com.ibm.server.allow.sigkill" property to true as a nodeagent custom property. Please review section "Java virtual machine settings" in the product documentation for more information.


source: https://www.ibm.com/developerworks/community/blogs/aimsupport/entry/Recommended_Maximum_Heap_Sizes_on_32_and_64_bit_WebSphere_Java_instances?lang=en

Recommended Maximum Heap Sizes on 32 and 64 bit WebSphere Java instances

One of the most common questions asked in WebSphere Java support is, "What is the recommended Maximum Heap size?"
One of the most common problems reported in WebSphere Java support is an "OutOfMemory" condition (Java or Native).
 
This blog is simply a starting point or general reference based upon daily observations in the technical support arena. It is not intended to be a solution for every situation, but rather a general set of starting recommendations.
Ideally, you will need to test appropriate values in your own environments and topologies, based upon application architecture, number of applications running, how busy the AppServer is and underlying load conditions, how much physical memory or RAM is installed and running, how many JVMs are being hosted and what other additional Java and native memory processes are running on the same machine.
 
For 32 bit platforms and Java stacks in general, the recommended Maximum Heap range for WebSphere Application Server (WAS) would be between (1024M - 1536M) or (1G - 1.5G); higher values will most likely eventually result in Native Memory contention. For Node Agents and Deployment Manager, depending upon how many nodes are managed and how many application deployments occur, you can probably utilize less heap memory; between (512M - 1024M) or (.5G - 1G) may suffice. *But the default out-of-the-box configuration value of 256M is most likely too low in most use-case scenarios.
*Remember that the WAS Java process shares a 4 Gigabyte memory address space with the OS in accordance with 32 bit design specification (User Virtual Memory is 2G).
Application Server    1024M - 1536M
Deployment Manager    512M - 1024M
Node Agent            512M - 1024M
 
For 64 bit platforms and Java stacks in general, the recommended Maximum Heap range for WebSphere Application Server would be between (4096M - 8192M) or (4G - 8G). For Node Agents and Deployment Manager, depending upon how many nodes are managed and how many application deployments occur, you can probably utilize less heap memory, between (2048M - 4096M) or (2G - 4G).
*Remember that the WAS Java process shares a 16 Terabyte memory address space with the OS in accordance with 64 bit design specification (User Virtual Memory is 8T).
Application Server    4096M - 8192M
Deployment Manager    2048M - 4096M
Node Agent            2048M - 4096M
 
Regarding the Minimum Heap value, we have found that with the newer product versions WAS v8.x and v9.x and the default GC policy of GENCON, a 'Fixed Heap' (Maximum Heap Size = Minimum Heap Size) combined with a 'Fixed Nursery' works and performs best. You can fix the nursery size with a Generic JVM argument of -Xmn####m (example: -Xmn1024m for a 1 GB nursery region). Without -Xmn, the nursery region defaults to approximately 25% of the max heap size, but it is variable, not fixed. The idea behind the GENCON GC policy resizing heap regions was to keep them smaller for faster GC cycles, but we have found that in practice the overhead of this resizing often makes GC very inefficient. To navigate to the right area to set -Xmn, see "Setting generic JVM arguments in WebSphere Application Server."
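
To make the 'Fixed Heap' and 'Fixed Nursery' combination concrete, the following sketch assembles the corresponding Generic JVM argument string. The helper class is hypothetical and only concatenates the -Xms, -Xmx and -Xmn options named in this post; the check that the nursery must be smaller than the heap is an assumption added for illustration.

```java
final class FixedHeapArgs {
    /** Builds generic JVM arguments for a fixed heap and a fixed nursery, both sizes in MB. */
    static String build(int heapMb, int nurseryMb) {
        if (nurseryMb <= 0 || nurseryMb >= heapMb) {
            throw new IllegalArgumentException("nursery must be positive and smaller than the heap");
        }
        // Minimum heap equals maximum heap, so the heap is never resized.
        return "-Xms" + heapMb + "m -Xmx" + heapMb + "m -Xmn" + nurseryMb + "m";
    }

    public static void main(String[] args) {
        // A 4 GB fixed heap with a 1 GB fixed nursery (25% of the heap).
        System.out.println(build(4096, 1024));
    }
}
```

The values chosen here are examples only and should be validated under load in your own environment.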
 
The Heap values can be set in a number of ways depending upon the actual product version of WebSphere, but typically from the Admin Console JVM process settings, from the Generic JVM Args (-Xmx -Xms), the WSAdmin command-line interface, Startup and Deployment scripts, manual server.xml modification (not recommended) and so forth. More details and step-by-step instructions can be found in the corresponding WebSphere product documentation based upon product version, as well as in related developerWorks articles and Blog entries.
 
*Please also keep in mind that the overall WAS JVM Process Size or Memory Footprint will typically be larger than Maximum Heap size (upwards of 1.5x), simply because it includes not only the Java Heap, but also underlying Classes and Jars, Threads and Stacks, Monitors, generated JIT code, malloc'd JNI Native Memory calls and so forth. For example:
  • -Xmx4G (process size could be around 6G on a busy AppServer)
  • -Xmx8G (process size could be around 12G on a busy AppServer)
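
The 1.5x rule of thumb also gives a quick way to check whether the JVMs hosted on a machine are likely to fit into physical memory. This is a rough sketch with hypothetical names; it uses the "upwards of 1.5x" figure from this post as an upper estimate and deliberately ignores memory needed by the operating system and other processes.

```java
final class FootprintEstimate {
    /** Upper estimate of the process size in MB: 1.5 times the maximum heap. */
    static long upperEstimateMb(long maxHeapMb) {
        return maxHeapMb * 3 / 2;
    }

    /** True when the summed estimates of all JVMs on the host stay within physical RAM. */
    static boolean fitsInRam(long physicalRamMb, long... maxHeapsMb) {
        long totalMb = 0;
        for (long heapMb : maxHeapsMb) {
            totalMb += upperEstimateMb(heapMb);
        }
        return totalMb <= physicalRamMb;
    }
}
```

For example, two JVMs with -Xmx4G are estimated at about 12G in total and fit on a 16G machine by this measure, while one -Xmx8G plus one -Xmx4G JVM (about 18G) do not.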
 
Caveat: This blog entry was primarily created for WebSphere Base and Network Deployment full version products, but I also want to point out that when using some of our stack products such as Business Process Manager (BPM), eXtreme Scale, Portal, or Process Server, the Maximum Heap sizes or ranges may need to be a bit larger than what I specified above.