This is the third installment of our Performance & Scaling series, a collection of blog posts and technical articles on improving the speed and efficiency of your Exago BI environment. Last month, we discussed data optimization. This month, we’ll look at best practices for planning your application architecture, most of which advance the following two objectives:
- Distribute workload among processors
- Store definitions externally in case of server failure
The application architecture should facilitate load balancing, allow for the easy addition of new engines, and preserve performance in the event of a failure. An ideal system adapts to stress, whether from increased internet traffic or a high volume of concurrent report executions, by spreading the load across the nodes in the network.
Now we’ll discuss methods for creating such an adaptive architecture for Exago BI.
Load Balancing the Web Server
Web farms load balance all of a web server’s responsibilities by storing session state in a location accessible to all farm servers, typically a state server. (This is known as out-of-process session storage, as opposed to in-process storage, which is the means of storing session state if an external state server isn’t used.)
Let’s say a user wants to see a set of possible filter values for a data field. This list must be retrieved from the data source. Within a web farm, the call would travel from the browser to a load balancing server responsible for routing the call to one of the web servers in the farm. The browser, which stores an id for the session in a cookie, sends that id with the call, and the appointed web server passes it to the state server, which uses the id to look up the session’s configuration and state (i.e., what reports are open, whether or not they’re saved, etc.). The web server then handles that one call, querying the database for the filter values and sending them back to the browser.
The next time that user clicks to display a filter dropdown list, the whole process will repeat, and the call may go to an entirely different server, unless the system is using sticky sessions. Sticky sessions route all calls for a particular session to the same web server each time, storing the session on that server instead of in an external location. A significant drawback to this method, however, is that if any server goes down, its sessions will be lost. State servers, by contrast, preserve the session definitions independently of the web servers, controlling for the possibility of web server failure. Of course, the state server could also crash, in which case the entire application would be inaccessible. The safest means of storing sessions, therefore, is to house them in a distributed database where redundancy will minimize risk.
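To make the state-server flow concrete, here is a minimal Python sketch of a two-server farm that shares one out-of-process session store. It is an illustration only, not Exago BI code: the class names, the round-robin routing, and the session fields are all assumptions made for the example.

```python
from itertools import cycle


class StateServer:
    """Hypothetical out-of-process session store shared by every web server."""

    def __init__(self):
        self._sessions = {}

    def load(self, session_id):
        # Create an empty session the first time an id is seen.
        return self._sessions.setdefault(session_id, {"open_reports": []})


class WebServer:
    """Hypothetical farm member that keeps no session state of its own."""

    def __init__(self, name, state_server):
        self.name = name
        self.state_server = state_server

    def handle(self, session_id, report):
        # Look up the session using the id the browser sent from its cookie.
        session = self.state_server.load(session_id)
        if report not in session["open_reports"]:
            session["open_reports"].append(report)
        return self.name, list(session["open_reports"])


class LoadBalancer:
    """Routes each call to the next server in turn (round robin, assumed)."""

    def __init__(self, servers):
        self._next_server = cycle(servers)

    def route(self, session_id, report):
        return next(self._next_server).handle(session_id, report)


state = StateServer()
farm = LoadBalancer([WebServer("web-1", state), WebServer("web-2", state)])

first_call = farm.route("abc123", "Sales")
second_call = farm.route("abc123", "Inventory")
```

In this sketch the two calls land on different servers, yet the second call still sees the report opened by the first, because the session lives in the shared store rather than on either web server.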
All environments, but especially those using a web farm architecture, are advised to also implement a folder management .NET assembly. Exago BI stores reports, folders, document templates, and report themes in a directory by default, but folder management allows admins to customize that storage. Many opt to store these files in databases, which have a number of advantages over file systems. Databases scale better, have built-in redundancy to protect against server crashes, provide opportunities to collect additional information around report definitions, and can support more sophisticated permissioning.
If you elect to host your application in the cloud, the system will use a series of virtual machines (VMs) to load balance calls to Exago BI. VMs are faster to spin up than physical machines, as they require no new hardware and come configured to your specifications as per your hosting contract. The VMs used in cloud-based architectures help facilitate easy and efficient scaling.
Load Balancing the Scheduler Servers
Environments using multiple scheduler servers are advised to make use of Exago BI’s scheduler queue, which allows for the custom storage of scheduled reports as well as the load balancing of scheduled executions. Instead of distributing schedule jobs to the allotted servers in a round-robin fashion, the queue stores the jobs in an external location (typically a database) and assigns individual report executions to servers based on their availability. This dramatically improves the scheduling service’s performance and, as in web farms, prevents the failure of a single server from taking multiple scheduled jobs offline.
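The difference between round-robin distribution and availability-based assignment can be seen in a small simulation. The Python sketch below is not the scheduler queue's actual API. The function names and job durations are hypothetical, and it assumes that every job is ready at time zero and that each server runs one job at a time.

```python
import heapq


def round_robin_finish_time(durations, n_servers):
    """Assign jobs to servers in fixed rotation, ignoring how busy each one is."""
    busy_until = [0] * n_servers
    for i, duration in enumerate(durations):
        busy_until[i % n_servers] += duration
    return max(busy_until)


def queue_finish_time(durations, n_servers):
    """Give each job to whichever server becomes free first."""
    # Min-heap of the times at which each server next becomes available.
    free_at = [0] * n_servers
    heapq.heapify(free_at)
    for duration in durations:
        earliest = heapq.heappop(free_at)
        heapq.heappush(free_at, earliest + duration)
    return max(free_at)


# Hypothetical execution times (in minutes) alternating long and short reports.
jobs = [10, 1, 10, 1, 10, 1]

rr_total = round_robin_finish_time(jobs, 2)   # one server gets every long job
queue_total = queue_finish_time(jobs, 2)      # long jobs are spread out
```

With these assumed durations, round robin finishes after 30 minutes because one server receives all three long reports, while availability-based assignment finishes after 21.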
Optimizing Session Efficiency
When storing state on a state server, the most time-consuming step in a session call is building the session configuration.
Ordinarily, most of the configuration is defined in a static config file, and some changes are made to the base configuration via the API on a session-by-session basis. Only those dynamic changes (known as the “delta”) are stored with the session state. Each time the browser calls the application, the state server (we’ll stick with that setup by way of example) looks up the session id, finds the stored delta, and applies it to the base configuration to “build” the full configuration for that session. This building process takes time and resources, but it’s usually preferable to storing the entire configuration with the session state. Config files can be large, and the delta is usually small enough that it doesn’t take too long to apply.
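As a rough illustration of the build step, the Python sketch below applies a stored delta to a base configuration. The setting names and the flat dictionary layout are invented for the example. Exago BI's real configuration is richer and is modified through its API.

```python
# Hypothetical base configuration, standing in for the static config file.
BASE_CONFIG = {"theme": "default", "max_rows": 1000, "allow_export": True}

# Hypothetical session store: only the per-session delta is kept.
session_store = {"abc123": {"max_rows": 5000}}


def build_config(base, delta):
    """Return the effective configuration without mutating the shared base."""
    effective = dict(base)   # copy, so other sessions still see the original
    effective.update(delta)  # session-specific overrides win
    return effective


effective_config = build_config(BASE_CONFIG, session_store["abc123"])
```

Only the one overridden setting travels with the session, which is why a small delta is cheap to store and quick to apply.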
In some cases, however, the base configuration is small compared to the delta. Let’s say, for example, that the delta is 90% of the effective config, and the base comprises the remaining 10%. Building the configuration file would take considerably longer than if the ratio were reversed. And by storing the delta, the state server is already storing 90% of the total config. In these instances, it is recommended that you use configuration caching, which allows the web server to store configuration data that will not be modified at runtime. Used correctly, this will go a long way toward speeding up your session initialization and reducing your memory footprint.
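One generic way to avoid repeating an expensive build is to cache the result. The Python sketch below uses simple memoization to show the idea. It is an assumption-laden illustration and does not describe how Exago BI's configuration caching is implemented. The settings, the tuple-based representation, and the build counter are all hypothetical.

```python
from functools import lru_cache

# Hypothetical base settings, kept as a tuple of pairs so they are immutable.
BASE_ITEMS = (("allow_export", True), ("theme", "default"))

build_count = 0  # counts how many times the full build actually runs


@lru_cache(maxsize=128)
def cached_config(delta_items):
    """Build the effective config once per distinct delta, then reuse it."""
    global build_count
    build_count += 1
    config = dict(BASE_ITEMS)
    config.update(delta_items)  # a large delta makes this the costly step
    return tuple(sorted(config.items()))


delta = (("max_rows", 5000), ("theme", "dark"))
config_a = cached_config(delta)  # built on the first call
config_b = cached_config(delta)  # served from the cache on the second call
```

The second call returns the stored result without rebuilding, which is the kind of saving that matters most when the delta makes up the bulk of the effective configuration.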
So, to recap, optimizing your application architecture for performance and scaling is a matter of distributing workload among servers and storing definitions in a safe, external location. This will help buffer the impact of high traffic or heavy workloads as well as protect performance from periodic server outages. If you have any questions regarding architecture optimization or deployment, let us know in the comments. Stay tuned for our next installment in this series, which will focus on optimizing Exago BI’s configuration settings for performance and scaling.
Thumbnail and Banner Photos: A spiral staircase in Antoni Gaudí’s masterpiece, La Sagrada Família in Barcelona, Spain. For insight into the architectural principles behind this structure, visit 99% Invisible. These modifications of “Sagrada Familia” by Ted & Dani Percival are licensed under CC BY 2.0.