Thursday, 13 September 2018

Anatomy of workspace/wstemp in WebSphere Application Server

What is a workspace:
Whenever a user logs into the administrative console, or uses wsadmin scripting to make a configuration change, the changes are stored in the workspace. When a user uses the ConfigService configuration service interface of the Java application programming interfaces (APIs), the user specifies a session object that is associated with the workspace in order to store the changes. Only when the user performs a save operation under the administrative console, wsadmin scripting, or the Java APIs are the changes propagated and merged with the master configuration repository.

For each administrative console user or each invocation of wsadmin scripting, the application server creates a separate workspace directory to store the intermediate changes until the changes are merged with the master configuration repository. Users of the Java APIs use different session objects to decide where the workspace directory resides. Both the administrative console and wsadmin scripting generate user IDs randomly. The user IDs are different from the user IDs that you use to log into the administrative console or wsadmin scripting. The Java APIs can either randomly generate the user ID or specify the user ID as an option when creating the session object.

Workspace keeps track of context and file states whenever a caller does an add / delete / extract / update using Workspace APIs.

Default wstemp directory location:  %user.install.root%\wstemp\
where %user.install.root% is %WAS_HOME%\profiles\<profile name>\

The location can be changed by setting the following JVM custom property:
              UNIX platform:
                           -Dwebsphere.workspace.root=/temp
              Windows platform:
                           -Dwebsphere.workspace.root=c:\temp

Change the JVM custom property through the administrative console by setting the JVM property as a name-value pair on the Custom properties page.
For example,
    In the administrative console, click System administration > Deployment manager > Java and Process Management > Process definition > Java Virtual Machine > Custom properties.

Workspace directory location
Default: %user.install.root%\wstemp\<user id>\workspace\
With the custom property set: %websphere.workspace.root%\wstemp\<user id>\workspace\


Workspace generates the following files to keep track of the status of the workspace, its contexts, and its files:
WorkSpace State
  •     ".workspace_<session id>"
                     Created under the %user.install.root%\wstemp\<user id>\workspace directory to distinguish which session of the user works on the workspace.
                     This file contains runtime call stack information from WorkSpaceManagerImpl.createWorkSpace(Properties).
Context State
  •     ".repositoryContext"
                     Created under each <context type>\<context name> subdirectory of the workspace; keeps the status of the files being accessed.

File State
  •     <File Name>".copy": keeps the digest of the file being extracted.
  •     <File Name>".current": keeps the original copy of the file if a merge is needed for this document. Only serverIndex.xml is merged at this time.


Directory types generated under wstemp folder
  •     Script*: created by a wsadmin script
  •     anonymous*: created by a client that passes a null workspace/session ID
  •     -NNNNNNNNN: created by the administrative console


It is the responsibility of the caller to remove the workspace session: create a session, use it, and discard it when finished. Each workspace is associated with a session object.

1. Session()
This constructor always creates a new unique session. The ID of the session will be anonymous+current-time-in-millisecs. The following two lines will result in the creation of a new unique workspace:
Session session = new Session();
configService.resolve(session, "Cell=");

So, every time a new session object is created this way and used, a new unique workspace directory is created. Therefore, these workspaces should be removed by calling the configService.discard(session) method or the workspace.remove() method.

2. Session(id, reuse)
This constructor creates a new session based on the supplied ID. If a workspace already exists for that ID, the same workspace is used. However, if the reuse flag is set to false, a new session with id+current-time-in-millisecs is created instead.
  • The following two lines will result in creation of a new workspace if the corresponding workspace directory does not exist:
                    Session session = new Session("myWorkspace", true);
                    configService.resolve(session, "Cell=");  
    
                    The workspace must be removed by calling workspace.remove() or configService.discard(session).
  • The following two lines will result in creation of a new unique workspace 
                    Session session = new Session("myWorkspace", false);
                    configService.resolve(session, "Cell=");

Note: Every time a session object is created this way, it results in a new unique workspace, so all workspaces created this way should be removed. If the caller code does not discard the sessions it creates, the size of the wstemp folder will grow, and it can eventually cause an OOM (out of memory) condition in the deployment manager or application server, depending on your setup.
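A leaked session is visible on disk: every undiscarded session leaves a directory under wstemp. As a minimal shell sketch (the profile path in the example call is an assumption; point it at your own profile's wstemp directory), you can count the session directories to spot a leak:

```shell
# Count workspace session directories (anonymous*, Script*, -NNNNNNNNN)
# directly under a wstemp directory. A steadily growing count suggests
# sessions are being created but never discarded.
count_sessions() {
  wstemp="$1"
  count=0
  for d in "$wstemp"/*/; do
    [ -d "$d" ] && count=$((count + 1))
  done
  echo "$count"
}

# Example call; the profile path is an assumption, adjust for your install.
count_sessions /opt/IBM/WebSphere/AppServer/profiles/Dmgr01/wstemp
```

Running this periodically (for example from cron) gives a simple trend line for workspace growth without enabling any trace.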

You can turn on the following trace string, either on the deployment manager or on a base application server depending on the type of setup, to identify the caller that is creating the workspace:

Current trace specification = *=info:com.ibm.ws.sm.workspace.impl.WorkSpaceManagerImpl=all

With the above trace string enabled, you will see logging like the following in the trace.log file:

In this example, I used the case "anonymous*: created by a client that passes a null workspace/session ID".

[5/9/14 9:00:15:241 NZST] 00000001 WorkSpaceMana >  getWorkSpace, UserID: anonymous1399582815240, SessionID: null, create workspace if not found: true Entry
[5/9/14 9:00:15:241 NZST] 00000001 WorkSpaceMana 3   Call stack info of createWorkSpace(prop), workspace id anonymous1399582815240:        
<----- new workspace anonymous1399582815240 is created ----->
    com.ibm.ws.sm.workspace.impl.WorkSpaceManagerImpl.getCallStack(WorkSpaceManagerImpl.java:591)
    com.ibm.ws.sm.workspace.impl.WorkSpaceManagerImpl.createWorkSpace(WorkSpaceManagerImpl.java:148)
    com.ibm.ws.sm.workspace.impl.WorkSpaceManagerImpl.getWorkSpace(WorkSpaceManagerImpl.java:304)
    com.ibm.ws.sm.workspace.impl.WorkSpaceManagerImpl.getWorkSpace(WorkSpaceManagerImpl.java:241)
    com.ibm.ws.sm.workspace.impl.WorkSpaceManagerImpl.getWorkSpace(WorkSpaceManagerImpl.java:230)
    com.ibm.ws.management.configservice.WorkspaceHelper.getWorkspace(WorkspaceHelper.java:100)
    com.ibm.ws.management.configservice.WorkspaceHelper.getScopeContexts(WorkspaceHelper.java:344)
    com.ibm.ws.management.configservice.RootObjectDelegator.getAll(RootObjectDelegator.java:118)
    com.ibm.ws.management.configservice.ConfigServiceImpl.queryConfigObjects(ConfigServiceImpl.java:948)
    com.ibm.ws.management.configservice.ConfigServiceImpl.resolve(ConfigServiceImpl.java:1083)
    com.ibm.ws.management.configservice.ConfigServiceImpl.resolve(ConfigServiceImpl.java:1043)
    com.ibm.bpmcommon.upgrade.database.ValidateDatabaseVersion.getCluster(ValidateDatabaseVersion.java:226)
    com.ibm.bpmcommon.upgrade.database.ValidateDatabaseVersion.verifyStandardDB4ND(ValidateDatabaseVersion.java:199)
    com.ibm.bpmcommon.upgrade.database.ValidateDatabaseVersion.start(ValidateDatabaseVersion.java:120)
    com.ibm.ws.runtime.component.ContainerHelper.startComponents(ContainerHelper.java:539)
    com.ibm.ws.runtime.component.ContainerImpl.startComponents(ContainerImpl.java:627)
    com.ibm.ws.runtime.component.ContainerImpl.start(ContainerImpl.java:618)
    com.ibm.ws.runtime.component.ApplicationServerImpl.start(ApplicationServerImpl.java:252)
    com.ibm.ws.runtime.component.ContainerHelper.startComponents(ContainerHelper.java:539)
    com.ibm.ws.runtime.component.ContainerImpl.startComponents(ContainerImpl.java:627)
    com.ibm.ws.runtime.component.ContainerImpl.start(ContainerImpl.java:618)
    com.ibm.ws.runtime.component.ServerImpl.start(ServerImpl.java:523)
    com.ibm.ws.runtime.WsServerImpl.bootServerContainer(WsServerImpl.java:310)
    com.ibm.ws.runtime.WsServerImpl.start(WsServerImpl.java:223)
    com.ibm.ws.runtime.WsServerImpl.main(WsServerImpl.java:686)
    com.ibm.ws.runtime.WsServer.main(WsServer.java:59)
    sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
    sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
    java.lang.reflect.Method.invoke(Method.java:611)
    com.ibm.wsspi.bootstrap.WSLauncher.launchMain(WSLauncher.java:234)
    com.ibm.wsspi.bootstrap.WSLauncher.main(WSLauncher.java:96)
    com.ibm.wsspi.bootstrap.WSLauncher.run(WSLauncher.java:77)
    sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
    sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
    java.lang.reflect.Method.invoke(Method.java:611)
    org.eclipse.equinox.internal.app.EclipseAppContainer.callMethodWithException(EclipseAppContainer.java:587)
    org.eclipse.equinox.internal.app.EclipseAppHandle.run(EclipseAppHandle.java:198)
    org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.runApplication(EclipseAppLauncher.java:110)
    org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:79)
    org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:369)
    org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:179)
    sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
    sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
    java.lang.reflect.Method.invoke(Method.java:611)
    org.eclipse.core.launcher.Main.invokeFramework(Main.java:340)
    org.eclipse.core.launcher.Main.basicRun(Main.java:282)
    org.eclipse.core.launcher.Main.run(Main.java:981)
    com.ibm.wsspi.bootstrap.WSPreLauncher.launchEclipse(WSPreLauncher.java:380)
    com.ibm.wsspi.bootstrap.WSPreLauncher.main(WSPreLauncher.java:151)
[5/9/14 9:00:15:254 NZST] 00000001 WorkSpaceMana 3   Create workspace [WorkSpaceManagerImpl.createWorkSpace(prop)]...
     WorkspaceUserPath: /u01/app/dev/d4/bpm/node1/wstemp/anonymous1399582815240
     WorkspacePath ...: /u01/app/dev/d4/bpm/node1/wstemp/anonymous1399582815240/workspace
     repositoryAdapter: com.ibm.ws.sm.workspace.impl.WorkSpaceMasterRepositoryAdapter
[5/9/14 9:00:15:254 NZST] 00000001 WorkSpaceMana 3   getClassOfType, className: com.ibm.ws.sm.workspace.impl.WorkSpaceMasterRepositoryAdapter
[5/9/14 9:00:15:255 NZST] 00000001 WorkSpaceMana 3   profileKey
                                 <null>
[5/9/14 9:00:15:258 NZST] 00000001 WorkSpaceMana A   WKSP0500I: Workspace configuration consistency check is disabled.
[5/9/14 9:00:15:287 NZST] 00000001 WorkSpaceMana <  getWorkSpace, WS: com.ibm.ws.sm.workspace.impl.WorkSpaceImpl@e332d7a3 Exit
[5/9/14 9:00:15:299 NZST] 00000001 WorkSpaceMana >  getWorkSpace, UserID: anonymous1399582815240, SessionID: null, create workspace if not found: true Entry
[5/9/14 9:00:15:299 NZST] 00000001 WorkSpaceMana 3   The same sessionId share the same WorkSpace.
[5/9/14 9:00:15:300 NZST] 00000001 WorkSpaceMana <  getWorkSpace, WS: com.ibm.ws.sm.workspace.impl.WorkSpaceImpl@e332d7a3 Exit
[5/9/14 9:00:15:328 NZST] 00000001 WorkSpaceMana >  getWorkSpace, UserID: anonymous1399582815240, SessionID: null, create workspace if not found: true Entry
[5/9/14 9:00:15:328 NZST] 00000001 WorkSpaceMana 3   The same sessionId share the same WorkSpace.
[5/9/14 9:00:15:329 NZST] 00000001 WorkSpaceMana <  getWorkSpace, WS: com.ibm.ws.sm.workspace.impl.WorkSpaceImpl@e332d7a3 Exit
[5/9/14 9:00:15:330 NZST] 00000001 WorkSpaceMana >  getWorkSpace, UserID: anonymous1399582815240, SessionID: null, create workspace if not found: true Entry
[5/9/14 9:00:15:330 NZST] 00000001 WorkSpaceMana 3   The same sessionId share the same WorkSpace.
[5/9/14 9:00:15:330 NZST] 00000001 WorkSpaceMana <  getWorkSpace, WS: com.ibm.ws.sm.workspace.impl.WorkSpaceImpl@e332d7a3 Exit
[5/9/14 9:00:15:451 NZST] 00000001 WorkSpaceMana >  getWorkSpace, UserID: anonymous1399582815240, SessionID: null, create workspace if not found: true Entry
[5/9/14 9:00:15:458 NZST] 00000001 WorkSpaceMana 3   The same sessionId share the same WorkSpace.
[5/9/14 9:00:15:459 NZST] 00000001 WorkSpaceMana <  getWorkSpace, WS: com.ibm.ws.sm.workspace.impl.WorkSpaceImpl@e332d7a3 Exit
[5/9/14 9:00:15:472 NZST] 00000001 WorkSpaceMana 3   removeWorkSpace, UserID: anonymous1399582815240, SessionID: nullCall stack info of removeWorkSpace(string, string): 
<----- discard() method is being called properly hence anonymous1399582815240 is removed at the end of session ----->
   com.ibm.ws.sm.workspace.impl.WorkSpaceManagerImpl.getCallStack(WorkSpaceManagerImpl.java:591)
    com.ibm.ws.sm.workspace.impl.WorkSpaceManagerImpl.removeWorkSpace(WorkSpaceManagerImpl.java:452)
    com.ibm.ws.sm.workspace.impl.WorkSpaceManagerImpl.removeWorkSpace(WorkSpaceManagerImpl.java:446)
    com.ibm.ws.management.configservice.ConfigServiceImpl.discard(ConfigServiceImpl.java:789)
    com.ibm.bpmcommon.upgrade.database.ValidateDatabaseVersion.getCluster(ValidateDatabaseVersion.java:242)
    com.ibm.bpmcommon.upgrade.database.ValidateDatabaseVersion.verifyStandardDB4ND(ValidateDatabaseVersion.java:199)
    com.ibm.bpmcommon.upgrade.database.ValidateDatabaseVersion.start(ValidateDatabaseVersion.java:120)
    com.ibm.ws.runtime.component.ContainerHelper.startComponents(ContainerHelper.java:539)
    com.ibm.ws.runtime.component.ContainerImpl.startComponents(ContainerImpl.java:627)
    com.ibm.ws.runtime.component.ContainerImpl.start(ContainerImpl.java:618)
    com.ibm.ws.runtime.component.ApplicationServerImpl.start(ApplicationServerImpl.java:252)
    com.ibm.ws.runtime.component.ContainerHelper.startComponents(ContainerHelper.java:539)
    com.ibm.ws.runtime.component.ContainerImpl.startComponents(ContainerImpl.java:627)
    com.ibm.ws.runtime.component.ContainerImpl.start(ContainerImpl.java:618)
    com.ibm.ws.runtime.component.ServerImpl.start(ServerImpl.java:523)
    com.ibm.ws.runtime.WsServerImpl.bootServerContainer(WsServerImpl.java:310)
    com.ibm.ws.runtime.WsServerImpl.start(WsServerImpl.java:223)
    com.ibm.ws.runtime.WsServerImpl.main(WsServerImpl.java:686)
    com.ibm.ws.runtime.WsServer.main(WsServer.java:59)
    sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
    sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
    java.lang.reflect.Method.invoke(Method.java:611)

If the discard() method were not called, you would not see the "WorkSpaceMana 3   removeWorkSpace," entry in the trace.log file when you look up workspace UserID anonymous1399582815240.

Note: If the discard() method is never called, the workspace session does not get cleaned up. So if the code makes many ConfigService API calls, the /profile_home/wstemp directory keeps growing, which will eventually cause an OOM (out of memory) condition on the JVM (deployment manager or application server). As a workaround, you can stop the JVM and clean up the contents of the /profile_home/wstemp directory. For a permanent solution, review the code to make sure a discard method is called to clear each workspace.
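The stop-and-clean workaround can be scripted. This is only a sketch under assumptions (the wstemp location and the 7-day age threshold in the example are placeholders), and it must be run only while the JVM that owns the workspaces is stopped:

```shell
# Remove wstemp session directories that have not been modified in the
# last N days. Run only while the owning JVM (dmgr or server) is stopped.
clean_wstemp() {
  wstemp="$1"
  days="$2"
  find "$wstemp" -mindepth 1 -maxdepth 1 -type d -mtime +"$days" \
    -exec rm -rf {} +
}

# Example call; the profile path is an assumption.
if [ -d /opt/IBM/WebSphere/AppServer/profiles/Dmgr01/wstemp ]; then
  clean_wstemp /opt/IBM/WebSphere/AppServer/profiles/Dmgr01/wstemp 7
fi
```

The age threshold guards against deleting a workspace that a recent session might still reference, but stopping the JVM first remains the safe condition.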

Source: https://www.ibm.com/developerworks/community/blogs/aimsupport/entry/anatomy_of_workspace_wstemp_websphere_application_server?lang=en

MustGather: Performance, hang, or high CPU issues with WebSphere Application Server on Linux

If you are experiencing performance, hang, or high CPU issues with WebSphere Application Server on Linux, this MustGather will assist you in collecting the data necessary to diagnose and resolve the issue. Two scripts can be used to collect the performance diagnostic information; download one of them and use it to collect information while the problem is occurring.

Collecting data


Complete the following three steps:

(1) Collecting the required data:

If you have not already done so, enable verboseGC and restart the problematic server(s).

At the time of the problem, run the attached script with the following command:

./linperf.sh [PID]


This script creates a file named linperf_RESULTS.tar.gz and three javacores. The script should be executed as the root user. As with any script, you may need to add execute permissions (chmod) before running it.

In the above command, [PID] is the process ID of the problematic JVM(s). If specifying multiple process IDs, separate each with a space.

(2) Collecting log files:

Collect the server logs (SystemOut.log, native_stderr.log,...) from the problematic server(s):

profile_root/logs/server_name/*

(3) Submitting required data:

Zip/Tar all the files gathered:
  • linperf_RESULTS.tar.gz
  • javacores
  • server logs (SystemOut.log, native_stderr.log,...)

Send the results to IBM Support: "Exchanging information with IBM Support"
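The packaging step above can be sketched as a single tar invocation. The helper and the output archive name below are illustrative, not part of the MustGather itself:

```shell
# Bundle the gathered MustGather artifacts into a single archive.
bundle_mustgather() {
  srcdir="$1"    # directory holding the gathered files
  archive="$2"   # output archive path
  shift 2
  tar -czf "$archive" -C "$srcdir" "$@"
}

# Example call with placeholder names; run it in the directory where
# linperf.sh was executed and the server logs were copied:
# bundle_mustgather . mustgather.tar.gz linperf_RESULTS.tar.gz SystemOut.log
```

Keeping everything in one archive avoids partial uploads when sending data to IBM Support.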




Frequently Asked Questions:

    What is the impact of enabling verboseGC?
    VerboseGC data is critical to diagnosing these issues. This can be enabled on production systems because it has a negligible impact on performance (< 2%).

    What is the linperf_RESULTS.tar.gz file and where can I find it?
    The linperf_RESULTS.tar.gz file is created while running the linperf.sh script and contains output from the commands called by the script. It will be created in the directory from which you execute the script.

    What are 'javacores' and where do I find them?
    Javacores are snapshots of the JVM activity and are essential to troubleshooting these issues. These files will usually be found in the <profile_root>. If you don't find the files here, you can search your entire system for them using the following command:

    find / -name "*javacore*"


If asked to do so:
The preceding data is used to troubleshoot most of these type of issues; however, in certain situations Support may require additional data. Only collect the following data if asked to do so by IBM Support.
    A series of system cores
      Collect a series of system cores by running the following commands:
      Note: These commands require the gdb debugger to be installed.

      gdb install_root/java/jre/bin/java [PID]
      (gdb) generate core1.[PID]
      (gdb) detach
      (gdb) quit

      <wait two minutes>

      gdb install_root/java/jre/bin/java [PID]
      (gdb) generate core2.[PID]
      (gdb) detach
      (gdb) quit

      <wait two minutes>

      gdb install_root/java/jre/bin/java [PID]
      (gdb) generate core3.[PID]
      (gdb) detach
      (gdb) quit

Creating core files with gdb as described above should not kill the process. If the 'generate' command does not work, try using the 'gcore' command instead.

      Process the resulting system core files (core1.[PID], core2.[PID],...) using the instructions in How to process a core dump using jextract on the IBM SDK on Windows, Linux, and AIX

    System core
In cases where a series of system cores cannot be collected, a single core file can be collected using the following method. Note, however, that this kills the process, and the diagnostic value of a series of system cores is much greater than that of a single core for these types of issues. The single core file must be processed with jextract as noted in the preceding section.

      kill -6 [PID]

    Monitor process sizes and paging usage
      The linmon.sh script will collect data every 5 minutes until it is stopped manually. Run the following command before the issue occurs to start the script:

      ./linmon.sh

      This will create two files: ps_mon.out and vmstat_mon.out.

Exchanging data with IBM Support

To diagnose or identify a problem, it is sometimes necessary to provide Technical Support with data and information from your system. In addition, Technical Support might also need to provide you with tools or utilities to be used in problem determination. You can submit files using one of the following methods to help speed problem diagnosis:

HMGR0152W: CPU Starvation detected messages in SystemOut.log

Problem(Abstract)

New system is working properly but HMGR warning messages are being logged in the SystemOut.log file.

Symptom

[10/25/05 16:42:27:635 EDT] 0000047a CoordinatorCo W HMGR0152W: CPU Starvation detected. Current thread scheduling delay is 9 seconds.

Cause

The HMGR0152W message is an indication that JVM thread scheduling delays are occurring for this process.

The WebSphere® Application Server high availability manager component contains thread scheduling delay detection logic that periodically schedules a thread to run and tracks whether the thread was dispatched and run as scheduled. By default, a delay detection thread is scheduled to run every 30 seconds, and an HMGR0152W message is logged if it does not run within 5 seconds of the expected schedule. The message indicates the delay time: the difference between when the thread was expected to get the CPU and when it actually got CPU cycles.
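The detection mechanism can be illustrated with a small shell sketch: sleep for a fixed period, then compare the actual elapsed time against the expected period. HMGR uses a 30-second period with a 5-second tolerance; the 1-second period in the example call is only so the sketch runs quickly.

```shell
# Report the scheduling delay, in whole seconds, observed around a sleep
# of the given period. Nonzero output means the process got the CPU later
# than expected, which is the condition HMGR0152W warns about.
measure_delay() {
  period="$1"
  start=$(date +%s)
  sleep "$period"
  end=$(date +%s)
  echo $(( end - start - period ))
}

measure_delay 1
```

On a healthy system the reported drift stays near zero; under paging or CPU contention it grows, which is exactly what the delay detection thread observes.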

The HMGR0152W message can occur even when plenty of CPU resource is available. There are a number of reasons why the scheduled thread might not have been able to get the CPU in a timely fashion. Some common causes include the following:
  • The physical memory is overcommitted and paging is occurring.
  • The heap size for the process is too small causing garbage collection to run too frequently and/or too long, blocking execution of other threads.
  • There might simply be too many threads running in the system, and too much load placed on the machine, which might be indicated by high CPU utilization.


Resolving the problem

The HMGR0152W message is attempting to warn you that a condition is occurring that might lead to instability if it is not corrected. Analysis should be performed to understand why the thread scheduling delays are occurring, and what action(s) should be taken. Some common solutions include the following:
  • Adding more physical memory to prevent paging.
  • Tuning the JVM memory (heap size) for optimal garbage collection.
  • Reducing the overall system load to an acceptable value.

If the HMGR0152W messages do not occur very often, and indicate that the thread scheduling delay is relatively short (for example, < 20 seconds), it is likely that no other errors will occur and the message can safely be ignored.

The high availability manager thread scheduling delay detection is configurable by setting either of the following two custom properties.
  • IBM_CS_THREAD_SCHED_DETECT_PERIOD determines how often a delay detection thread is scheduled to run. The default value of this parameter is 30 (seconds).
  • IBM_CS_THREAD_SCHED_DETECT_ERROR determines how long of a delay should be tolerated before a warning message is logged. By default this value is 5 (seconds).

These properties are scoped to a core group and can be configured as follows:
  1. In the administrative console, click Servers > Core groups > Core groups settings and then select the core group name.

  2. Under Additional Properties, click Custom properties > New.

  3. Enter the property name and desired value.

  4. Save the changes.

  5. Restart the server for these changes to take effect.

While it is possible to use the custom properties mentioned above to increase the thread-scheduling-detect-period until the HMGR0152W warning messages no longer occur, this is not recommended. The proper solution is to tune the system to eliminate the thread scheduling delays.

Related information

MustGather: High CPU issues
Tuning operating systems
Tuning the Application Server Environment

CPU is starved: How to feed my CPU.

Scarlett O'Hara once said, "I'm going to live through this and when it's all over, I'll never be hungry again." That was a story and era before computers. These days our computers can become starved, or at least the Java (tm) virtual machine (JVM) can. Performance is a key concern for everyone. When users have to wait, they are discouraged and either become distracted or they go somewhere else. Keeping a system running smoothly is key. Every now and then systems will have a slow spot. However, when it continuously impacts users, the issue must be investigated. For this article, I am focusing on the following example outputs seen in a JVM SystemOut.log. These examples came from an IBM Business Process Manager SystemOut.log file.

HMGR0152W: CPU Starvation detected. Current thread scheduling delay is 23 seconds.
DCSV0004W: DCS Stack DefaultCoreGroup at Member PCCell01\PCNode01\BPM751PDEV.AppTarget.PCMNode01.0: Did not receive adequate CPU time slice. Last known CPU usage time at 12:23:55:452 CST. Inactivity duration was 31 seconds.

What does CPU Starvation mean?
CPU starvation means that the JVM had to wait for processing time: some other process took 100% of the CPU and the JVM could not run. Twenty-three seconds is a long time for a server to wait. In some cases, I have seen the wait time as high as 70 seconds.

Where is all the CPU time going?
There are two places to look. One is the system itself. Is there a process on the operating system that has run away and is consuming 99% of the CPU? A simple top command combined with the kill -3 command on Linux operating systems, or Task Manager on Windows operating systems, can help. If another application on the server has hung and is taking all the CPU time, investigate and stop that process.

If the operating system does not have any extra processes and you still see CPU starvation, most likely the server is a guest operating system in a virtual environment. This means the larger virtual infrastructure does not have enough CPU time to give all of the virtual machines it controls. Contact your virtual machine provider or internal sysops team to start investigating the overall health of the virtual system. Other virtual machines in the environment might be using the system heavily and need to move to a different server. Another option is to dedicate CPU to the guest rather than sharing it, which is the default. We have a document that offers links to other documents to consider when you are running J2EE applications and databases in a virtual environment.


Source: https://www.ibm.com/developerworks/community/blogs/WebSphere_Process_Server/entry/hungry_cpu?lang=en

How to diagnose error "SRVE0255E: A WebGroup/Virtual Host to handle {0} has not been defined"

Error SRVE0255E means that the web container could not find a web group (web module) or virtual host to handle the request.
Here are the steps to diagnose this error:
1. Make sure that the URL entered at the browser is correct. Particularly, make sure that the context root from the URL matches the context root configured for the application.
2. Review the SystemOut.log to make sure that the application and server are started successfully and without any errors.
3. Verify that the application web module is mapped to the correct/intended virtual host. You can do this from the admin console by navigating to the following path:
Applications > Websphere enterprise applications > [app_name] > Virtual hosts

4. Under the virtual host that the application is mapped to (#3), make sure that there is a host alias definition for the host name and port number that this request is sent to. You can do this from the admin console by navigating to the following path:
Environment > Virtual hosts > [virtual_host_name] > Host aliases

5. Check the host alias definitions for other virtual hosts on this same server and make sure that there is no duplicate host alias definition with the same host name and port number. For example, if you have a host alias definition for host name www.example.com and port number 9080 under the virtual host default_host, you must NOT have a duplicate host alias definition, something like hostname * and port number 9080, under another virtual host such as custom_host.
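Step 5 can be partially automated. The sketch below greps for exactly duplicated hostname/port attribute pairs in a virtual host configuration file; the file layout (an XML with hostname="..." and port="..." attributes, as in virtualhosts.xml) is an assumption, and wildcard overlaps such as * versus a concrete host name on the same port still require a manual review:

```shell
# Print hostname/port pairs that appear more than once in the given
# virtual host configuration file. Exact duplicates only; a wildcard
# alias (*) overlapping a concrete host on the same port is not caught.
find_duplicate_aliases() {
  file="$1"
  grep -o 'hostname="[^"]*"[^>]*port="[0-9]*"' "$file" | sort | uniq -d
}
```

An empty result means no literal duplicates; any printed pair is defined under more than one virtual host and should be removed from all but one.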

How nodeagent monitors WebSphere Application Server.

This document explains how the monitoring policy works in WebSphere Application Server (WAS) and the recommended way to start application servers in parallel.
There are multiple ways to monitor the application server, for example, using JMX programming or third-party tools. This document covers only nodeagent monitoring of application servers and monitoring servers through a Windows service.

1. When I reboot my machine, I want to start all the servers (including Dmgr and nodeagent) automatically. What is the recommended way to do to that?
Create Windows services for the deployment manager and the nodeagent using WASServiceCmd and set them to Automatic. Do not create one for the application server.
Set the monitoring policy of the application server to Running.
The nodeagent can then monitor the application server process: it can start the server (if the server is down) during nodeagent startup, restart a hung server, or restart the server when it goes down abnormally.
Note: Never create a Windows service for the application server and set it to Automatic. During a machine reboot, the application server might try to start before the nodeagent starts. This can cause the server to fail to register with the nodeagent LSD and fail to start. This is the only reason we do not recommend having the server started automatically by an operating system service such as WASService. Review the dW Answers item "What is the recommended way to start WebSphere Application Server, Dmgr and nodeagent automatically?" for more information.

2. How can we start the application servers in parallel? In other words, can I start all application servers at the same time (not in sequence)?
Yes, you can do it using the com.ibm.websphere.management.nodeagent.bootstrap.maxthreadpool custom property.
Set the property under System Administration > Node agent > nodeagent_name > Java and process management > Process definition > Java virtual machine > Custom properties.
Use this property to control the number of threads that can be included in a newly created thread pool. A dedicated thread is created to start each application server Java virtual machine (JVM). The JVMs with dedicated threads in this thread pool are the JVMs that are started in parallel whenever the node agent starts.
You can specify an integer from 0 - 5 as the value for this property. If the value you specify is greater than 0, a thread pool is created with that value as the maximum number of threads that can be included in this newly created thread pool. The following table lists the supported values for this custom property and their effect.

Property set to 0 or not specified  - The node agent starts up to five JVMs in parallel.
Property set to 1                   - The node agent starts the JVMs serially.
Property set to a value from 2 to 5 - The node agent starts that number of JVMs in parallel.
Note: With this property you can start a maximum of only 5 servers at a time.

3. Why does logging off of a Windows system crash WebSphere Application Server?
You should never log off the system when no Windows service has been created for the process. Doing so leaves no footprint about the crash; it is almost like killing the Java process from Task Manager. When you log off, the server process is killed, whether it is the Dmgr, nodeagent, or application server process. You must create a Windows service for the server process if you want the process to keep running after you log off the Windows service user.
To create windows service using wasservicehelper, see the topic "Using the WASServiceHelper utility to create Windows services for application servers" in the product documentation.
Notes:
  1. If you want the process to keep running without a Windows service, lock the computer instead of logging off.
  2. If you restart the system, the process is killed regardless of any Windows service.
  3. It is recommended not to create Windows services for application servers; the nodeagent should monitor application servers through the monitoring policy. For more information, refer to the product documentation.
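As a sketch only (the service name, server name, and paths here are hypothetical; verify the exact WASService.exe options for your product version in the documentation), creating a Windows service for an application server looks roughly like:

```
WASService.exe -add "MyAppServerService"
    -serverName server1
    -profilePath "C:\IBM\WebSphere\AppServer\profiles\AppSrv01"
    -startType automatic
```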

4. How does the nodeagent monitor the application server, and how does it know the previous state of the application server?
When the nodeagent monitors the application server (with the monitoring policy created as mentioned in question 1), it saves the server state information in the monitored.state file. This file maintains the previous server state and the application server PID. If an application server crashes or hangs, the nodeagent reads the previous state of the server from the monitored.state file and then tries to start the application server automatically.
Note: If you notice a StringIndexOutOfBoundsException or any other exception in the NodeAgent.loadNodeState stack (in the nodeagent SystemOut.log file), the monitored.state file is corrupted. You must stop all servers, delete the file, and then start the nodeagent again. For example:

    Caused by: java.lang.StringIndexOutOfBoundsException
    at java.lang.String.substring(String.java:1115)
    at com.ibm.ws.management.nodeagent.NodeAgent.loadNodeState(NodeAgent.java:3210)

5. My application servers were monitored by the nodeagent. When the server was hung, why didn't the nodeagent restart the server?
Before answering this question, please review the product documentation section "Monitoring policy settings", which explains how the monitoring policy works in WAS.
The nodeagent PidWaiter pings the application server at every ping timeout interval to get its status. If the PidWaiter does not get a response back, the application server is considered hung. Once the server is identified as unresponsive, the PidWaiter sends a SIGTERM to the process, which does not guarantee that the process stops immediately: SIGTERM only requests a normal shutdown, and a server that does not respond to any request simply stays hung forever.
If you want a hung server to be killed when it does not respond to the nodeagent ping, set the com.ibm.server.allow.sigkill property to true in the nodeagent custom properties. Please review the section "Java virtual machine settings" in the product documentation for more information.


source: https://www.ibm.com/developerworks/community/blogs/aimsupport/entry/Recommended_Maximum_Heap_Sizes_on_32_and_64_bit_WebSphere_Java_instances?lang=en

Recommended Maximum Heap Sizes on 32 and 64 bit WebSphere Java instances

One of the most common questions asked in WebSphere Java support is, "What is the recommended Maximum Heap size?"
One of the most common problems reported in WebSphere Java support is an "OutOfMemory" condition (Java or Native).
 
This blog is simply a starting point or general reference based upon daily observations in the technical support arena. It is not intended to be a solution for every situation, but rather a general set of starting recommendations.
Ideally, you will need to test appropriate values in your own environments and topologies, based upon application architecture, the number of applications running, how busy the AppServer is and the underlying load conditions, how much physical memory (RAM) is installed, how many JVMs are being hosted, and what other Java and native memory processes are running on the same machine.
 
For 32 bit platforms and Java stacks in general, the recommended Maximum Heap range for WebSphere Application Server (WAS) is between (1024M - 1536M) or (1G - 1.5G); higher values will most likely eventually result in Native Memory contention. For Node Agents and the Deployment Manager, depending upon how many nodes are managed and serviced and how many application deployments occur, you can probably use less heap memory; between (512M - 1024M) or (.5G - 1G) may suffice. *But the default out-of-the-box configuration value of 256M is most likely too low in most use-case scenarios.
*Remember that the WAS Java process shares a 4 Gigabyte memory address space with the OS in accordance with 32 bit design specification (User Virtual Memory is 2G).
Application Server      1024M - 1536M
Deployment Manager       512M - 1024M
Node Agent               512M - 1024M
 
For 64 bit platforms and Java stacks in general, the recommended Maximum Heap range for WebSphere Application Server is between (4096M - 8192M) or (4G - 8G). For Node Agents and the Deployment Manager, depending upon how many nodes are managed and serviced and how many application deployments occur, you can probably use less heap memory, between (2048M - 4096M) or (2G - 4G).
*Remember that the WAS Java process shares a 16 Terabyte memory address space with the OS in accordance with 64 bit design specification (User Virtual Memory is 8T).
Application Server      4096M - 8192M
Deployment Manager      2048M - 4096M
Node Agent              2048M - 4096M
 
Now regarding the Minimum Heap value: we have found that when using the newer product versions WAS v8.x and v9.x with the default GC Policy of GENCON, setting a 'Fixed Heap' (Maximum Heap Size = Minimum Heap Size) works and performs best, as does a 'Fixed Nursery'. You can fix the nursery size with a Generic JVM argument of -Xmn####m (example: -Xmn1024m for a 1 GB nursery region). Without -Xmn, the nursery region defaults to approximately 25% of the max heap size, but it is variable, not fixed. The idea behind the GENCON GC Policy's resizing of heap regions was to keep them smaller for faster GC cycles, but we have found that in practice the overhead of this resizing often makes GC very inefficient. To navigate to the right area to set -Xmn, see "Setting generic JVM arguments in WebSphere Application Server."
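For example, a 4 GB fixed heap with a 1 GB fixed nursery combines these settings as generic JVM arguments (the sizes here are illustrative only, not a recommendation for any specific workload):

```
-Xms4096m -Xmx4096m -Xmn1024m
```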
 
The Heap values can be set any number of ways depending upon the actual product version of WebSphere: typically from the Admin Console JVM process settings, from the Generic JVM Args (-Xmx -Xms), from the wsadmin command-line interface, from startup and deployment scripts, by manual server.xml modification (not recommended), and so forth. More details and step-by-step instructions can be found in the corresponding WebSphere product documentation for your version, as well as in related developerWorks articles and blog entries.
 
*Please also keep in mind that the overall WAS JVM Process Size or Memory Footprint will typically be larger than Maximum Heap size (upwards of 1.5x), simply because it includes not only the Java Heap, but also underlying Classes and Jars, Threads and Stacks, Monitors, generated JIT code, malloc'd JNI Native Memory calls and so forth. For example:
  • -Xmx4G (process size could be around 6G on a busy AppServer)
  • -Xmx8G (process size could be around 12G on a busy AppServer)
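The rule of thumb above is simple arithmetic; as a quick sketch (the 1.5x factor is the article's support-experience observation, not a guarantee):

```python
def estimated_footprint_gb(xmx_gb, overhead_factor=1.5):
    """Estimate total JVM process size: Java heap plus native overhead
    (classes and jars, threads and stacks, JIT code, JNI allocations)."""
    return xmx_gb * overhead_factor
```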
 
Caveat: This blog entry was primarily created for the WebSphere Base and Network Deployment full version products, but note that when using some of our stack products, such as Business Process Manager (BPM), eXtreme Scale, Portal, and Process Server, the Maximum Heap sizes or ranges may need to be a bit larger than what is specified above.

Tuesday, 14 August 2018

Configuring and Implementing Dynamic Caching in WebSphere Application Server

Dynamic Cache Equals Performance 


- Dynamic Cache is part of the IBM solution for improving the performance of Java 2 Platform, Enterprise Edition (J2EE™) applications running within WebSphere Application Server.

- Dynamic Cache supports caching of Java™ servlets, JavaServer Pages™ (JSP™), WebSphere command objects, Web services objects, and Java objects.

- This presentation describes the features and configuration steps of a dynamic cache environment for servlets and JSPs.

The concept of caching static information in a browser, proxy, or web server provides an effective way to reduce network and processing requirements.

A large part of the overhead of many web applications is related to serving not static content, but dynamic content based on user input and data retrieved from backend resources such as databases.

IBM's Dynamic Cache solution allows customizable caching of dynamic content, which can provide a major performance boost for high-volume web sites.

Enabling Dynamic Cache:
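Beyond enabling the dynamic cache service in the administrative console, individual cacheable objects are declared in a cachespec.xml file. A minimal sketch for caching a servlet follows; the servlet URI and parameter name are hypothetical, so check the cachespec.xml element reference in the product documentation for the full syntax:

```xml
<cache>
  <cache-entry>
    <class>servlet</class>
    <name>/products/list</name>  <!-- hypothetical servlet URI -->
    <cache-id>
      <!-- vary the cached copy by a request parameter -->
      <component id="category" type="parameter">
        <required>false</required>
      </component>
      <timeout>180</timeout>  <!-- evict the entry after 180 seconds -->
    </cache-id>
  </cache-entry>
</cache>
```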


Monday, 23 April 2018

WebSphere server topologies

Single-server topology:


Provides an application server and a Web site. Initially, a single instance is configured to use an internal database. Convert to a cluster or server farm to improve capacity and availability. WebSphere Portal, WAS, and the database are all installed on the same server.



Optionally configure a Web server, with IBM WAS's HTTP plug-in to...



  • Serve static resources
  • Provide plug-in point for a corporate SSO agent in the event that WebSphere Portal participates in a global SSO domain



Standalone server topology

The stand-alone scenario differs from the single-server topology in that the database server, LDAP server, and Web server software are installed on different physical servers than IBM WebSphere Portal. This configuration enables you to distribute the software across the network and therefore distribute the processing load.
For a stand-alone configuration, we can use an existing, supported database in the network and an existing, supported LDAP directory. Configure IBM WebSphere Portal to authenticate with the LDAP server. The following illustration displays a common topology for a stand-alone server. The HTTP server, (also referred to as Web server) is installed on a server in a protected network. The LDAP server and database server are also installed on different servers. WebSphere Portal and WAS are installed on the same server.
Choose this topology if you do not require a robust clustered environment. We can also use this option to examine and test functions and features to decide how to accomplish the business goals. We can add this stand-alone production server to a federated IBM WAS Network Deployment cell. It creates a federated, unclustered production server.





Clustered servers topology

IBM WebSphere Portal uses a dmgr server type for clustering portal servers. All portal instances share the same configuration, including...


  • database
  • applications
  • portlets
  • site design
Clusters also provide a shared domain in which session and cache data can be replicated. The cluster provides an application synchronization mechanism that ensures consistent application management (start, stop, updates) across the cluster.
The HTTP Server plug-in balances user traffic across all members of the cluster. Session affinity ensures that a user remains bound to a specific cluster instance for the duration of their session. When a cluster member is down, workload management routes traffic around it.
IBM WebSphere Virtual Enterprise provides dynamic clusters to dynamically create and remove cluster members based on the workload. The On Demand Router has all of the features of the HTTP Server plug-in with the additional ability to define routing and service policies. It is possible to deploy multiple portal clusters in a single cell.


Vertical cluster topology

A vertical cluster has more than one cluster instance within a node. A node typically represents a single physical server in a managed cell, but it is possible to have more than one node per physical server. It is very simple to add additional vertical cluster instances to a node using the dmgr console, as each additional instance replicates the configuration of the first; no additional installation or configuration is necessary.
How many vertical instances can be created in a single node depends on the physical resources available in the local system (CPU and memory). Too many vertical cluster instances can exhaust the physical resources of the server, at which point it is appropriate to add horizontal cluster instances to increase capacity if necessary.
The following diagram illustrates a vertical cluster. A single node has multiple instances of IBM WebSphere Portal installed. The node is managed by the dmgr. Each instance on the node authenticates with the same LDAP server and stores data in the same database server. Although not depicted in the illustration, the database and LDAP servers could also be clustered if needed for failover, increased performance, and high availability.



Combination of horizontal and vertical clusters

Most large-scale portal sites incorporate a combination of horizontal and vertical clustering to take full advantage of the resources of a single machine before scaling outward to additional machines.



Multiple clusters

With multiple clusters, where each cluster is in a different cell...


  • One cluster can be taken out of production use, upgraded and tested while leaving other clusters in production, achieving 100% availability with no maintenance windows.
  • We can deploy clusters closer to the people they serve, improving the responsiveness of the content.

For the most part, each cluster should be seen as a totally isolated system, administered independently, with its own configuration, isolated from the other clusters. The only exception is with the sharing of the following portal database domains...


  • Community
  • Customization

These domains store portal configuration data owned by the users themselves.
Each cluster can be deployed within the same data center, to help with improving maintainability and improve failure isolation, or across multiple data centers, to protect against natural disaster and data center failure, or to simply provide a broader geographical coverage of the portal site.
The farther apart the clusters are, the greater the impact of network latency between them. Consequently, the less likely you are to want to share the same physical database between clusters for the shared domains, and the more likely you will want to resort to database replication techniques to keep the databases synchronized.
Typically, in a multiple portal cluster topology, HTTP Servers are dedicated per cluster, since the HTTP Server plug-in's configuration is cell-specific. To route traffic between data centers (banks of HTTP Servers), a separate network load-balancing appliance is used, with rules in place to route users to specific data centers, either based on locality or on random site selection, such as through DNS resolution. Domain-based (locality-based) data center selection is preferred because it predictably keeps the same user routed to the same data center, which helps preserve session affinity and optimum efficiency.
DNS resolution based routing selection can cause random behavior in terms of suddenly routing users to another datacenter during a live session. If this happens, the user's experience with the portal site may be disrupted as the user is authenticated and attempts to resume at the last point in the new site. Session replication and/or proper use of portlet render parameters can help diminish this effect.



  • active/active: All portal clusters receive user traffic simultaneously from network load balancers and HTTP Servers. If maintenance on one cluster is required, all production traffic is switched to the other cluster.
  • active/passive: Production traffic is routed to a subset of the available portal clusters (e.g. 1 of 2, or 2 of 3). There is always one cluster not receiving any traffic. Maintenance is typically applied first to the offline cluster, which is then brought into production traffic while each of the remaining clusters is taken out and maintained in a similar fashion.


As an alternative to deploying multiple portal clusters where each cluster is in a different cell, it is also possible to deploy multiple portal clusters in the same cell. Different cells give you total isolation between clusters, and the freedom to maintain all aspects of each cluster without affecting the other. Different cells, however, require different dmgrs and thus different Administration Consoles for managing each cluster. Multiple clusters in the same cell reduces the administration efforts to a single console, but raises the effort level to maintain the clusters since there is a high degree of resource sharing between the multiple clusters.
While multiple portal clusters in a single cell has its uses, especially in consolidating content authoring and rendering servers for a single tier, it does increase the administrative complexity significantly. IBM recommends that multiple portal clusters be deployed in multiple cells, to keep administration as simple as possible.


Monday, 16 April 2018

Start Websphere Application Server automatically with shell scripting

#!/bin/sh
#
# Script to start and stop WebSphere Application Server 5.x

WAS_HOME="/opt/WebSphere/AppServer"   # WAS install dir
SERVERS="server1 MyAppServer"         # list of app servers

if [ ! -d "${WAS_HOME}" ]; then
    echo "$0: ${WAS_HOME} does not exist, aborting" >&2
    exit 1
fi

case "$1" in
'start')
    # increase resource limits
    ulimit -n 1024
    ulimit -s 16384
    for s in ${SERVERS}; do
        "${WAS_HOME}"/bin/startServer.sh "$s"
    done
    ;;

'stop')
    for s in ${SERVERS}; do
        "${WAS_HOME}"/bin/stopServer.sh "$s"
    done
    ;;

'status')
    "${WAS_HOME}"/bin/serverStatus.sh -all
    ;;

*)
    echo "Usage: $0 <start|stop|status>" >&2
    exit 1
    ;;
esac