## RGS 1.7.0 Deployment Checklist
### Overview
- **Objective**: Deploy RGS 1.7.0 successfully, ensuring all components and services function correctly with the new version.
- **Key Changes**:
  - Upgrade to Cassandra 4.
  - Transition from camelCase to snake_case identifiers.
  - YAML structure overhaul.
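
The camelCase to snake_case transition listed above can be illustrated with a small helper. This is only a sketch of the naming rule: the sample identifiers are hypothetical, and the authoritative renames are whatever the 1.7.0 schema project defines.

```python
import re


def camel_to_snake(identifier: str) -> str:
    """Convert a camelCase identifier to snake_case (illustrative rule only)."""
    # Break between a lowercase letter or digit and the uppercase letter after it.
    with_breaks = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", identifier)
    # Break an acronym run from the capitalized word that follows it.
    with_breaks = re.sub(r"(?<=[A-Z])(?=[A-Z][a-z])", "_", with_breaks)
    return with_breaks.lower()


if __name__ == "__main__":
    # Hypothetical identifiers, not taken from the real schema.
    for name in ("activeSession", "gameId", "RGSVersion"):
        print(name, "->", camel_to_snake(name))
```

A helper like this is mainly useful for spotting identifiers that still need attention; names that are already lowercase pass through unchanged.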
### Pre-Deployment
1. **Database Upgrade**
   - Ensure Cassandra is upgraded to version 4.
   - Verify that all table and column identifiers are transitioned from camelCase to snake_case.
   - Confirm that `CREATE_IF_NOT_EXISTS` is removed from all relevant YAML files to prevent the creation of camelCase tables.
2. **Schema Updates**
   - Apply the 1.7.0 schema updates from the [GitHub Schema project](https://github.com/Everi-Digital/rgs-server-cassandra-schema).
   - Execute the following script on Frey, Lannister, and Stark to recreate the `slot_activesession` table with tombstone management improvements:
   ```sql
   -- The keyspace here is lannister; adjust it as needed when running on Frey and Stark.
   -- gc_grace_seconds and tombstone_compaction_interval are both 10800 s (3 hours),
   -- well below Cassandra's default gc_grace_seconds of 864000 s (10 days).
   CREATE TABLE lannister.slot_activesession (
       platformid text,
       jurisdictionid text,
       operatorid text,
       userid text,
       sessionid text,
       gameid text,
       lastplay bigint,
       PRIMARY KEY (platformid, jurisdictionid, operatorid, userid, sessionid)
   )
   WITH CLUSTERING ORDER BY (jurisdictionid ASC, operatorid ASC, userid ASC, sessionid ASC)
       AND gc_grace_seconds = 10800
       AND additional_write_policy = '99p'
       AND bloom_filter_fp_chance = 0.01
       AND caching = { 'keys' : 'ALL', 'rows_per_partition' : 'NONE' }
       AND comment = ''
       AND compaction = {
           'class' : 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy',
           'tombstone_compaction_interval' : 10800,
           'tombstone_threshold' : '0.05',
           'unchecked_tombstone_compaction' : 'true'
       }
       AND compression = { 'chunk_length_in_kb' : 64, 'class' : 'org.apache.cassandra.io.compress.LZ4Compressor' }
       AND default_time_to_live = 0
       AND speculative_retry = '99p'
       AND min_index_interval = 128
       AND max_index_interval = 2048
       AND crc_check_chance = 1.0
       AND cdc = false
       AND memtable_flush_period_in_ms = 0;
   ```
3. **YAML Configuration**
   - Migrate the YAML files to the "main" branch in the spring-config-repo.
   - Update and validate `application.yaml` for environment-specific variables.
   - Ensure the `APP_ENVIRONMENT` variable is configured correctly for the environment.
4. **OpenTelemetry**
   - Verify OpenTelemetry is correctly configured or disabled depending on the environment.
   - Ensure the required environment variables (`OTEL_EXPORTER_OTLP_ENDPOINT`, etc.) are set.
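
One way to confirm that `CREATE_IF_NOT_EXISTS` is gone from the configuration is to scan a local checkout of the config repository. The sketch below assumes the YAML files live under a single directory and use a `.yml` or `.yaml` extension; the function name and the directory layout are illustrative, not part of the RGS tooling.

```python
from pathlib import Path


def find_schema_action_leftovers(config_dir, needle="CREATE_IF_NOT_EXISTS"):
    """Return (path, line number, text) for each non-comment YAML line containing needle."""
    hits = []
    for path in sorted(Path(config_dir).rglob("*")):
        if not path.is_file() or path.suffix not in {".yml", ".yaml"}:
            continue
        lines = path.read_text(encoding="utf-8").splitlines()
        for number, line in enumerate(lines, start=1):
            stripped = line.strip()
            # Skip whole-line comments; they do not affect table creation.
            if needle in stripped and not stripped.startswith("#"):
                hits.append((str(path), number, stripped))
    return hits


if __name__ == "__main__":
    # "." is a placeholder; point this at the checkout of the config repo.
    for path, number, text in find_schema_action_leftovers("."):
        print(f"{path}:{number}: {text}")
```

An empty result suggests the setting has been removed everywhere the scan could reach; it does not cover values injected from outside the repository.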
### Deployment Steps
1. **Service Updates**
   - Deploy the updated services with the taskDeployer script, and confirm that every service is at the correct version.
   - Verify that new services, such as `Tablegame`, are configured and started successfully.
   - Check for and resolve any startup issues, such as container errors or task failures.
2. **Coordination with LiveOps / DevOps**
   - Coordinate any CDN updates with LiveOps.
   - Monitor the deployment closely for errors.
3. **Post-Deployment Checks**
   - Verify the proper operation of all services (e.g., `round-monitor`, `spinfusion`).
   - Ensure all Cassandra schema changes are applied and functioning correctly.
   - Confirm YAML configurations are correct and services are operating as expected.
### Monitoring
1. **Logs and Alerts**
   - Monitor CloudWatch logs for any issues during deployment.
   - Set up alerts for critical errors or service failures.
2. **Post-Deployment Validation**
   - Validate the deployment by checking service functionality, data integrity, and overall system health.
   - Coordinate with QA to ensure all critical paths are tested and validated.
### Deployment Notes
- **Common Issues**:
  - Issues with Cassandra 4 and `num_keys` that require configuration adjustments.
  - Potential startup failures due to YAML misconfigurations or missing files.
- **Troubleshooting Steps**:
  - For `CannotPullContainerError` issues, remove stopped (non-running) Docker containers.
  - For task failures, verify that all dependencies and environment variables are correctly set.
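
The environment-variable check for task failures can be sketched as a short preflight script. It covers only the two variables this checklist names, `APP_ENVIRONMENT` and `OTEL_EXPORTER_OTLP_ENDPOINT`; the full required set is not listed here, and the OpenTelemetry endpoint may be intentionally absent where OpenTelemetry is disabled. The function name is illustrative.

```python
import os

# Only the variables named in this checklist; extend per environment.
REQUIRED_VARIABLES = ("APP_ENVIRONMENT", "OTEL_EXPORTER_OTLP_ENDPOINT")


def missing_variables(required=REQUIRED_VARIABLES, environ=None):
    """Return the names in `required` that are unset or blank in `environ`."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_variables()
    if missing:
        print("Missing or empty:", ", ".join(missing))
    else:
        print("All listed variables are set.")
```

Passing `environ` explicitly makes it possible to check a task definition's variables without exporting them into the current shell.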
It seems that your colleague has made significant progress in constructing the live deployment checklist for the 1.7.0 release. To complete the work, here’s a structured outline and the steps required:
### 1. Finalize Pre-Deployment Procedures/Checklists
- **Version Checks**: Ensure all microservices and the ECS cluster are running the appropriate versions for the 1.7.0 deployment.
- **YAML Files**: Ensure all YAML files have been updated as required for 1.7.0, especially considering the changes to the Spring Config Repo and the new `tablegame.yaml`.
- **Cassandra Schema**: Confirm that the Cassandra 4 upgrade has been correctly implemented, including the renaming of camelCase to snake_case identifiers and the handling of `num_keys` values.
- **Jenkins Build**: Verify that all services and protocols have been built on Jenkins without errors.
### 2. Deployment Procedure
- **Order of Microservice Startup**: Start the services in the following order to avoid dependency issues.
  1. Cassandra
  2. SpringConfig
  3. eureka
  4. configService
  5. gamestate
  6. admin
  7. auth
  8. protocol services
  9. progressive
  10. gamehost
  11. slot
  12. prizegen
  13. gateway
  14. Bingo service
  15. report
  16. Localization
  17. SpinFusion
  18. Round Monitor
  19. Accounting (test servers only)
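
The startup order above can be captured as data so that a script or runbook can check it. This is a minimal sketch: the names are copied from the list, while `startup_plan` and `first_out_of_order` are hypothetical helpers rather than part of the taskDeployer script.

```python
STARTUP_ORDER = [
    "Cassandra", "SpringConfig", "eureka", "configService", "gamestate",
    "admin", "auth", "protocol services", "progressive", "gamehost",
    "slot", "prizegen", "gateway", "Bingo service", "report",
    "Localization", "SpinFusion", "Round Monitor", "Accounting",
]
# Accounting is started on test servers only.
TEST_ONLY = {"Accounting"}


def startup_plan(is_test_server):
    """Return the services to start, in order, for this kind of server."""
    return [s for s in STARTUP_ORDER if is_test_server or s not in TEST_ONLY]


def first_out_of_order(started):
    """Return the first service started after one that should follow it, else None."""
    position = {name: index for index, name in enumerate(STARTUP_ORDER)}
    highest = -1
    for name in started:
        index = position[name]  # raises KeyError for an unknown service name
        if index < highest:
            return name
        highest = index
    return None


if __name__ == "__main__":
    for step, service in enumerate(startup_plan(is_test_server=False), start=1):
        print(f"{step:2d}. {service}")
```

Keeping the order in one list means the test-only exception is applied in a single place instead of being remembered during the deployment.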
### 3. Post-Deployment Validation
- **Monitoring**: Set up or validate monitoring for all services, focusing on those with known issues such as `spinfusion` and `prizegen`.
- **Opsgenie Heartbeat**: If this is a new environment, confirm that the heartbeat service is correctly set up.
- **Game Configuration**: Ensure the `platform.clientconfig.client.servicePath` field is present and correctly configured.
- **Round Cancellation Configurations**: Verify the new `cancel-round-flag` and `cancel-round-age` YAML values and ensure the cron job for canceled rounds is set up.
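
Checking that a field such as `platform.clientconfig.client.servicePath` is present can be sketched as a dotted-path lookup. The example assumes the configuration has already been loaded into nested dictionaries (for instance from YAML or JSON); where this field actually lives is not stated here, and the sample value is made up. The same helper could check `cancel-round-flag` and `cancel-round-age` once their full paths are known.

```python
def missing_paths(config, dotted_paths):
    """Return the dotted paths that are absent, None, or empty strings in config."""
    missing = []
    for path in dotted_paths:
        node = config
        for key in path.split("."):
            if not isinstance(node, dict) or key not in node:
                missing.append(path)
                break
            node = node[key]
        else:
            # The path exists; treat None and "" as not configured.
            if node is None or node == "":
                missing.append(path)
    return missing


if __name__ == "__main__":
    # Hypothetical configuration used only to exercise the helper.
    sample = {"platform": {"clientconfig": {"client": {"servicePath": "/example"}}}}
    print(missing_paths(sample, ["platform.clientconfig.client.servicePath"]))
```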
### 4. Troubleshooting and Validation
- **Database and Task Issues**: Document any issues encountered during the deployment, such as the `CannotPullContainerError` or issues with starting tasks, and provide resolutions.
- **Configuration Compatibility**: Check backward compatibility for environments not yet upgraded to Kubernetes and adjust the configuration if necessary.
### 5. Documentation and Knowledge Sharing
- **Update Documentation**: Ensure all changes, procedures, and troubleshooting steps are well-documented in your internal wiki for future reference.
- **Team Communication**: Share the checklist and any important updates with the RGS teams and DevOps to ensure everyone is aware of the changes and can support the deployment effectively.