The MMI Cloud Services Operations team provides service delivery and service assurance for data and software services, meeting and preferably exceeding the business
and organizational goals and ever-changing requirements of MMI. Our Operations and Infrastructure team supports MMI's service delivery using state-of-the-art
server, network, database, data warehousing and data center processes and technology.
We hire leading-edge people to develop forward-thinking cloud computing services. Our people are key to our considerable success.
We make use of infrastructure technologies such as:
NetApp 3160s, virtualization, Hadoop, Cassandra, HBase, MySQL, Oracle RAC, various Red Hat variants (Oracle Unbreakable, CentOS, Red Hat), Jira, Cacti and Puppet. There is
a minimal amount of Windows in the installed base.
Service delivery allows existing and new service functionality to be provided in a consistent way and with predictable results, meeting service level expectations (availability, performance, cost, timing). It includes the infrastructure build-out and completion of the on-boarding process to ensure all operational requirements (capacity, throughput, latency, redundancy, etc.) are met prior to production launch. It also ensures that production-grade operational documentation, monitoring and other processes are thorough and well understood.
Service assurance provides production service monitoring, diagnostics and timely service restoration, meeting service level expectations (availability, performance).
It includes change management, incident management and problem management activities.
Being part of the Advanced Platforms organization allows Operations and Infrastructure personnel to deploy, support and implement next-generation technologies which deliver services to more than eighteen million actively supported end users.
Scope of Responsibilities:
- Design, build-out and ongoing support of the production server environment (> 5000 servers).
- Support of the production environment running our own applications in three US-based data centers and one China-based data center.
- Troubleshooting production problems, working with Engineering and other teams to help resolve production stability issues
- Troubleshoot ongoing problems, isolate them down to the server, and fix hardware and/or software (Linux OS) issues.
- Server operations support of projects: kickstarting new hosts, managing configuration changes, and adding new hosts to monitoring and reporting.
- On-call rotation as part of the team's responsibilities. Potential for follow-the-sun support at times, utilizing resources in China, India, etc.
- Advanced level troubleshooting skills to resolve hardware/OS and networking issues
- Excellent Red Hat Linux knowledge and experience
- Excellent understanding of and troubleshooting skills with Kickstart, DNS, DHCP, remote console access, the Linux kernel and networking.
- Excellent scripting experience in Bash and Perl/Python/Ruby to help automate routine tasks.
- Strong understanding of Intel-based server hardware running Linux operating systems
- Working experience with configuration management tools such as CFEngine, Puppet or Chef
- Working knowledge of Xen and other hypervisor implementations.
- Must have experience supporting large and complex data centers
- Strong working knowledge of Apache Tomcat/Jetty
- Understanding of LAMP-like implementations
- Understanding of the need for load balancers within the production stack.
- DESIRED: A good working knowledge of networks and networking infrastructure and the integration points with the data center / labs