Re: Examples of existing infrastructures?

Monday, 29 March 2021

hey hvjunk & maciej,

thanks for your quick and helpful responses.

On 29.03.21 01:04, hvjunk wrote:
...
 I’m running a couple of ProxMox clusters on bare-metal servers [...]
and use a “jump host” playset on another “separate and separately protected” environment
that I use for creating users etc. [...] deployments to the hypervisors. I then have
jumphosts that I created from my initial debian template [...] that is also provisioned
from the same “jump host” as I use for the ProxMox hypervisors. There I configure the
setup for Ansible/Debops as well as the various client’s logins to their jumphost.

Just to check if I understood correctly: at first you have a dedicated
Ansible controller (are you using the term jump host as a synonym?) for
provisioning the Proxmox instances AND the initial "stack" jump hosts /
Ansible controllers as entry points into their subnets?
...
 My bootstrap “tip” for VMs

Thanks for the MAC into DHCP / 2nd playbook tip! Right now I have a
debopsed PXE server set up pushing out a custom preseed config with full
disk encryption enabled, since that is a requirement for me. My workflow
for creating a new VM is cloning a QEMU template, then manually setting
it up to run the encrypted install, then running a complete debops
common playbook, which also enables cryptsetup & dropbear_initramfs.
After that I would run it again with more debops / other services enabled.
...
> - How are the roles separated onto different hosts? provisioning
> order, network design, security zones, etc.
 Depends yet again on the specifics for each stack, but my main template had been:
FortiGate-VM Firewall on the hypervisor’s public interface, connecting to the jumphost in
a DMZ and the production/staging/dev/qa servers each in their own subnet, only reachable
via the jumphost for port 22 and only the front-end webserver accessible from the
outside.

 As I have a proper FortiGate-VM in play, I can do proper limiting of outgoing traffic and
SSL deep inspection of outgoing traffic so that way I also force as much as possible
DNS/apt-caching/etc. to internal servers, and the devs need to help me specify the
specific outside resources they need to access.

Sounds great. The location of jump hosts is still confusing to me,
though. It sounds like you have:

WAN -- FW -- jump host -- LAN1
                       |_ LAN2
                       |_ LAN3, etc.

one single jumphost that has NICs in every subnet / VLAN? Where is the
Ansible controller then? Also, how do you prevent that jumphost from
being a single point of failure?

...
> - How do you handle secrets?
 yeah… THAT is a problem…

I've used debops-padlock so far and committed the encrypted secrets, but
then found that the organization of my own gpg / ssh keys is really
lacking (securely synchronizing between my clients, number and
separation of keys, etc.). Ordered a Nitrokey, maybe this will help me
become more organised in this regard.
...
 yeah well, beware of over automation ;)

Oh, I can identify :D
"Automation" lured me with the promise of getting rid of meticulously
crafted snowflake servers, but I stayed for the possibility of having a
huuuuge (well-configured) cluster of services I'll never have the time
to really use :D

-------------

Maciej's mail:

On 29.03.21 10:49, Maciej Delmanowski wrote:
...
 Keep in mind that DebOps and Ansible are just a set of tools that provide
 abstraction to the underlying Linux ecosystem. Since you wrote that you are
 starting with system administration, I'm curious - did you try managing a host
 or two by hand, without any automation?

Yes, I probably should have written "starting with automating /
designing server infrastructures". I have some experience of
administrating different kinds of Windows (trying to get rid of those
responsibilities, I'm telling you) / Linux / Unix servers, but never
learned administering any of it from scratch. I "fell" into
administration being the person that knew the most IT in an NGO-like
group that needed certain webservices. Thus my knowledge is shaped
"organically" around the problems that arose setting these up. So when I
started using debops (which I found researching more efficient ways to
admin) I knew the basics of setting up a server, but was mostly ignorant
of concepts "tying hosts together" like LDAP, a PKI, PXE, advanced
DHCP/DNS etc. Also debops seems to be very thorough in setting up a
system: reading the common playbook I stumbled upon so many things I
haven't done / known before, like apt_proxy or tcpwrappers.

...
 [Resources]
Thank you so much for those, very valuable.

...
 With a completely new environment, I would try and find all the quirks the
 provider has - do the hosts have the proper DNS PTR records available, are
 'ansible_hostname' and 'ansible_fqdn' variables resolved properly, and so on.
 When you have the bootstrap.yml playbook working as expected, it's all pretty
 easy from there. I usually apply the common.yml playbook first and then
 inspect the host to see if basics are set up correctly - PKI realms, firewall,
 expected UNIX groups and user accounts. Afterwards, it all depends on the
 purpose of a given host.

Do you test for these quirks automatically or manually?
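
For the PTR record part at least, an automated check seems feasible. The
following is only a minimal sketch in Python using the standard library
resolver; the function name check_dns_consistency and the example host
name are made up for illustration and are not part of DebOps or Ansible.
It compares the forward lookup of a host name with the reverse lookup of
the resulting address:

```python
import socket


def check_dns_consistency(fqdn, resolve=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Return a list of problems; an empty list means A and PTR records agree.

    'resolve' and 'reverse' are injectable so the check can be tried out
    without real DNS (hypothetical helper, not a DebOps function).
    """
    try:
        address = resolve(fqdn)
    except OSError as exc:
        return [f"forward lookup of {fqdn} failed: {exc}"]

    try:
        # gethostbyaddr returns (hostname, aliaslist, ipaddrlist)
        ptr_name = reverse(address)[0]
    except OSError as exc:
        return [f"no usable PTR record for {address}: {exc}"]

    # Compare case-insensitively and ignore a trailing root dot
    if ptr_name.rstrip(".").lower() != fqdn.rstrip(".").lower():
        return [f"PTR for {address} is {ptr_name}, expected {fqdn}"]
    return []


if __name__ == "__main__":
    # 'host1.example.org' is a placeholder; replace it with a real host name
    for problem in check_dns_consistency("host1.example.org"):
        print(problem)
```

Run against each new host before bootstrap.yml, this would only cover the
DNS side; whether 'ansible_hostname' and 'ansible_fqdn' come out right
still has to be checked on the host itself.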
...
 It all depends on the available resources. Do you have access to a single
 beefy hardware machine? Just cram everything in there, most DebOps roles are
 designed in such a way that there shouldn't be conflicts. If you can set up as
 many VMs or containers as you want, then it might be a good idea to separate
 different applications into different hosts, or even create multiple instances
 of a given application with proper redundancy. Some roles like slapd are
 designed with this kind of operation in mind, check their documentation.

What I am doing right now is designing an ideal lab environment on a
Proxmox host with sufficient resources (for testing). In it I try to
focus on finding a balance between secure separation of services and
efficiently using a VM's given resources. So far I have:

opnsense: DHCP / DNS / FW

Host1: Controller / PKI / Jump Host (?)
Host2: PXE / Preseed / Apt-Cache
Host3&4: slapd-servers
Host5: Monitoring / Logserver
Host6: DB-Server
Host7: Jenkins
Host8: Nextcloud
Host9+: Appservers

Hopefully, this blueprint makes it easier to adapt to not-so-ideal 
circumstances, but at least it's a good lab to learn debops.

...
 The 'site.yml' DebOps playbook has the order of various roles designed pretty
 well, from setting up basic host services like firewall, SSH, networking, then
 further to various databases and backend services, finishing on end-user
 applications like GitLab and Nextcloud. If you plan to write your own roles
 to deploy applications, it's a good practice to check the playbooks of similar
 software stack included in DebOps to see what might be needed for your
 software deployment.

Should I try to find the right position for my custom roles, or is
appending them to a site.yml in my inventory enough? If not, how do I
specify a position without editing the debops code?

...
 Jump Hosts
 The SSH protocol is pretty versatile here. [...]

Sorry, this is getting redundant, but in your example, where is the
Ansible controller located? Do you have one in your "home environment"
that controls all of your different debops environments or one
controller in each debops environment to which you (manually) connect
through a jump host?

BTW: for others learning this I found this blogpost an excellent 
explanation of different ssh tunneling mechanisms: 
https://leftasexercise.com/2019/12/23/using-ansible-with-a-jump-host/
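
To make the "one controller at home, reaching everything through a jump
host" variant concrete, here is a small hedged sketch in Python. It only
builds the inventory variable that tells Ansible to pass OpenSSH's
ProxyJump option; the helper name proxyjump_vars, the user "admin" and
the host "jump.example.org" are placeholders I made up, not anything
taken from DebOps or from the setups described in this thread:

```python
import json


def proxyjump_vars(jump_user, jump_host, jump_port=22):
    """Build Ansible variables that route SSH through a jump host.

    Uses the OpenSSH 'ProxyJump' option ([user@]host[:port]) via the
    'ansible_ssh_common_args' connection variable.
    """
    target = f"{jump_user}@{jump_host}:{jump_port}"
    return {"ansible_ssh_common_args": f"-o ProxyJump={target}"}


if __name__ == "__main__":
    # Placeholder values; the JSON output could be saved as a group_vars file
    print(json.dumps(proxyjump_vars("admin", "jump.example.org"), indent=2))
```

Applied to a group of hosts behind the firewall, this would let a single
controller outside reach them, at the cost of the jump host becoming a
dependency for every playbook run.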

...
 Secrets are handled by the 'debops.secret' role with either EncFS or git-crypt
 to provide encryption at rest. Of course it's best if you use that on top of
 an encrypted filesystem to not leave traces on the hardware, and don't publish
 repositories with secrets on public websites like GitHub to minimize exposure.

Having only just started to use git, this makes me super anxious about
using public repositories, but finding out about debops-padlock eased
that a bit.

...
 It's a chicken-and-egg problem [...]

Thanks for the detailed walkthrough. I only found out about
bootstrap-ldap via the newly added sssd role documentation;
bootstrap-sssd seems to work similarly. How do I know which tasks
to skip? Probably by knowing the code, right? :)

...
 Over the years developing DebOps and helping people debug issues in their
 infrastructure I can say that each person's environment is different. It all
 depends on the purpose of the infrastructure - web applications will have
 different requirements than an HPC cluster, which will have different
 requirements than a backup storage array. The recent trend of going into the
 Cloud can cover probably 70%-80% of common use cases pretty easily, especially
 when you just get prepared VMs for your applications akin to Heroku. But
 handling your own infrastructure properly from the ground up is still a skill
 which you acquire over years of practice. So don't get discouraged if you
 stumble on a roadblock - everything can be either fixed or redesigned if
 necessary.

Thanks for the encouragement and the support! As strange as it may sound,
I'm having great fun learning debops and its multitude of possibilities.
I just feel overwhelmed by it at times, but that keeps me motivated.
