This is a new role being created to support the lab infrastructure for the Teradata Client and Platform Engineering Department. The responsibilities include the following:
1. Accounting Services
· Providing bookkeeping services for TCPE development and test environment assets (Intel-based servers, Dell-based servers, native UNIX servers, network switches, WAN simulators, adapters, cables, software, licenses, and the mainframe). This includes:
o Keeping an inventory of everything, including servers, connectivity hardware, software, licenses, etc.
o Determining which assets require support and what amount/type/level of support is required.
o Renewing contracts before their expiration dates.
o Processing incoming and outgoing loaners from Flextronics.
· Determining when servers/hardware should be retired and handling the “surplus of equipment” process.
· Provisioning assets for general or exclusive use by development teams. This includes:
o Creating and maintaining a calendar-based repository with machine and/or VM image ownership.
o Keeping the provisioning system nimble enough to allow hourly reservations, or at least reservations that last a few days.
o Monitoring and responding to reservation requests.
o Making reservations readily visible to development teams so that conflicts can be avoided or at least minimized.
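As an illustration of the calendar-based reservation repository described above, the following is a minimal sketch of conflict detection for hourly or multi-day reservations. It is not an existing TCPE tool; the class, function, asset, and team names (ReservationCalendar, reserve, "lab-srv-01", "team-a") are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Reservation:
    asset: str       # hypothetical machine or VM image name
    team: str        # hypothetical development team name
    start: datetime  # inclusive start of the reservation
    end: datetime    # exclusive end of the reservation


def conflicts(a: Reservation, b: Reservation) -> bool:
    """Two reservations conflict if they target the same asset and overlap in time."""
    return a.asset == b.asset and a.start < b.end and b.start < a.end


class ReservationCalendar:
    """In-memory stand-in for a calendar-based reservation repository."""

    def __init__(self) -> None:
        self._reservations: list[Reservation] = []

    def reserve(self, asset: str, team: str, start: datetime, end: datetime) -> list[Reservation]:
        """Book the asset if it is free.

        Returns the list of existing reservations that clash with the request.
        An empty list means the reservation was recorded.
        """
        if end <= start:
            raise ValueError("reservation must end after it starts")
        request = Reservation(asset, team, start, end)
        clashes = [r for r in self._reservations if conflicts(r, request)]
        if not clashes:
            self._reservations.append(request)
        return clashes
```

Because the end time is treated as exclusive, back-to-back hourly reservations on the same asset do not count as conflicts, and returning the clashing reservations tells the requesting team who currently holds the asset.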
2. Environment Configuration Services
· Maintaining systems. This includes:
o Checking on a regular basis to make sure servers and/or VM images are operable.
o Testing every system’s availability after a power event (e.g., annual shutdowns, power surges, other planned downtime).
o Cleaning up systems and/or images by creating an automated process to restore a system to a known good state following development team use.
o Providing time-limited, QID-based superuser access so that it is easy to determine who has made system changes.
o Planning operating system upgrades (patches and versions) based on supported platform and version charts.
o Restricting use of the root file system to cases that are known to and managed by administrators; access should be granted only in special circumstances.
· Providing disaster recovery. This includes:
o Determining which servers and/or assets need to be backed up. Within those systems, backed-up data could include source code, operating system assets (e.g., compilers and build scripts), and test data.
o Drafting and publishing disaster recovery (DR) plans.
· Configuring development and test environments. This includes:
o Refreshing servers and/or virtual Windows, Linux, and Mac OS X images.
o Cabling machines together and/or creating private LANs.
o Connecting various hardware devices (e.g., WAN simulator, big switches, servers, IB adapters, etc.).
o Assigning lab power to cabinets. As systems are allocated to the various lab spaces, power must be provided either by engaging an electrical contractor or by assigning/re-assigning existing power outlets. The total power drawn must not exceed panel capacity.
o Monitoring the cooling environment to make sure equipment does not overheat.
o Notifying users in the event of a power or cooling failure, or when the power company requires systems to be shut down due to peak power requirements (hot days).
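The panel capacity constraint mentioned under lab power assignment can be illustrated with a small check. This is only a sketch under stated assumptions: the function names, the cabinet labels, the wattage figures, and the optional headroom parameter are all hypothetical, and real capacity planning would follow the ratings supplied by the electrical contractor.

```python
def panel_load_watts(cabinet_loads: dict[str, float]) -> float:
    """Sum the power currently assigned to every cabinet on one panel."""
    return sum(cabinet_loads.values())


def can_add_cabinet(
    cabinet_loads: dict[str, float],
    new_load_watts: float,
    panel_capacity_watts: float,
    headroom: float = 0.0,
) -> bool:
    """Return True if a new cabinet fits within the panel capacity.

    headroom is a hypothetical safety margin: a fraction of the panel
    capacity (0.0 to just under 1.0) that is deliberately left unused.
    """
    if new_load_watts < 0:
        raise ValueError("load must be non-negative")
    if not 0.0 <= headroom < 1.0:
        raise ValueError("headroom must be in the range [0.0, 1.0)")
    usable_watts = panel_capacity_watts * (1.0 - headroom)
    return panel_load_watts(cabinet_loads) + new_load_watts <= usable_watts
```

For example, with cabinets already drawing 3000 W and 4500 W on a 10000 W panel, a 2000 W cabinet fits but a 3000 W cabinet does not.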