Healing Transformation
the_weirdest_conversation - the_graph_of_the_tribe_and_the_spectrum_is_a_sight_to_behold
I stopped at a taco shop on the way home tonight. I'd gone out for coffee and had been spite coding, creating a bridge into Confluence from my markdown viewer. In the short term, it's an easier path to permanence and the whole point of the platform is empathy, so why not meet the Confluence people in their space?
Their home, my content.
It's already a graph anyways..
"sections": [
  {
    "id": "7551ba44d928eb2e",
    "path": "<nope>-technical-implementation",
    "slug": "<still nope>-technical-implementation",
    "level": 1,
    "text": "<loop, it's not that exciting: Technical Implementation",
    "metadata": {},
    "content": [],
    "comments": [],
    "children": [
      {
        "id": "bbb72c4eccf3bb3d",
        "path": "<squirrel>-technical-implementation.original-ticket-description",
        "slug": "original-ticket-description",
        "level": 2,
        "text": "Original Ticket Description",
        "metadata": {},
        "content": [
          {
            "id": "66c53b477ea1dd68",
            "type": "paragraph",
            "raw": "**Summary**: Enable DNS resolution from <should know better>-prod to <scary land>-dev for <ai people>/<content people> dev endpoints",
            "rendered": null
          },
          {
            "id": "3f3e1026073bd06f",
            "type": "paragraph",
            "raw": "Prod workloads need to temporarily call <ai people>/<content people> dev endpoints for testing/mitigation. Today, prod cannot resolve dev hostnames. We would like resources in <should-know-better>-prod (Acct <most definitely not you naughty thing>) to resolve hostnames hosted in <scary-land>-dev (Acct <seriously, do you think i'm silly?>).",
            "rendered": null
          },
          {
            "id": "85df54baed2fd04b",
            "type": "paragraph",
            "raw": "Target domains (initial):\n- <internal domain in dev account>\n",
            "rendered": null
          },
          {
            "id": "780115eb121c6e75",
            "type": "hr",
            "raw": "---",
            "rendered": null
          }
        ],
        "comments": [],
        "children": []
      },
      {
the_long_and_winding_road - i_dont_like_molding_people_clay
I show you that for two reasons. One... markdown as a graph huh? That's cool - that to Confluence is a lib away now. That's how you grow a garden, one flower at a time. I did the graph to make adding comments at headers/nodes easier. Now I can feed that directly to an xhtml converter and we're off. Lemme go find that guy and update his case, I'd forgotten it was even there until I went looking for the markdown to include in the post :)
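A minimal sketch of that hand-off, assuming the node shape shown in the JSON above (`level`, `text`, `content`, `children`). The function names are hypothetical, only `paragraph` and `hr` blocks are handled, and inline markdown such as `**bold**` is escaped rather than converted. It is not the actual bridge, just the shape of the walk.

```python
from html import escape


def render_section(section: dict) -> str:
    """Render one section node, then its children, as XHTML."""
    # Clamp the heading level to the range XHTML supports (h1..h6).
    level = min(max(int(section.get("level", 1)), 1), 6)
    parts = [f"<h{level}>{escape(section['text'])}</h{level}>"]

    for block in section.get("content", []):
        if block["type"] == "hr":
            parts.append("<hr/>")
        elif block["type"] == "paragraph":
            parts.append(f"<p>{escape(block['raw'])}</p>")
        # Any other block type is skipped in this sketch.

    # Depth-first walk keeps headings in document order.
    for child in section.get("children", []):
        parts.append(render_section(child))
    return "\n".join(parts)


def render_document(doc: dict) -> str:
    """Render every top-level section of the graph."""
    return "\n".join(render_section(s) for s in doc.get("sections", []))
```

Because comments hang off the same nodes, a fuller converter could emit them next to the heading they belong to without a second pass.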
It was an interesting day.
The story is roughly, new service needs access to ML, has really dirty data. ML is crashing with regex issues.
So you think... clean the data before sending, that doesn't need ML yet. Or fix the regex issues.
Nope, connect prod service to dev ML so if it crashes the impact isn't felt. I'm sure that costs nothing to scale out...
So they ask for help. The ML API is deployed on my platform. It's well established, been ticking over seamlessly for the most part. On their 60GB ECR images, deployed on Fargate... maybe not seamlessly but nothing the occasional ephemeral storage increase can't fix.
I'm a pattern matcher.
I suggest they use the private link we established a while ago that we use for four other existing accounts. I added an NLB, with an extra one in front just in case so we can swap internals without having to redo everyone's links. I find it worth the cost.
Cloud engineering chimes in with "use the transit gateway".
The what...?
That seems a bad idea. I expressed concern but then asked for their best practices on DNS across a transit gateway because I knew what was coming next.
Forward the private domain.
For a temporary workaround? To dev? No... maybe try a little harder?
By now the DevOps team that actually own the ML platform realize what is going on and slightly disagree with the idea as well but everyone's overruled. So someone realizes that they can use the load balancer's actual aws name and it will resolve, so they do that. Except host headers are a thing and so are shared load balancers and those typically aren't the types of rules we put in them.
I built the platform. I have some teams I provide direct support to, the ML team are not one of them and I am strongly hinting on the side to a DevOps engineer over there that the privatelink is the preferred architecture. I cannot perform any of the work unfortunately, as much as it would be easier than playing the games.
Finally, I hint to a developer on the dev team (who I am responsible for, and who is a friend - at least a work friend) that the rule needs adjusting to accept the raw aws domain. He posts a brilliant analysis to the ticket. Where it lingered...
After all that they got through it, my friend did make a bit of a whoops. He then chimed back into the ticket that he needed to get to one of their own dev services and was getting the same error ("200 - OK / Nothing to see here move along", gotta have a sane default right?)
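Why the raw load balancer name resolves but still lands on the default response can be sketched as below. This is a toy model, not the real listener configuration: the hostnames and the `Rule` type are made up for illustration, and it uses exact host matching where a real load balancer rule may also allow wildcards.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    host: str    # Host header value this rule matches (exact match only here)
    target: str  # name of the backend that should receive the request


# The "sane default" quoted above, returned when no rule matches.
DEFAULT_RESPONSE = "200 - OK / Nothing to see here move along"


def route(host_header: str, rules: list[Rule]) -> str:
    """Pick a backend by Host header, falling back to the default response."""
    host = host_header.split(":")[0].lower()  # drop any port, ignore case
    for rule in rules:
        if rule.host.lower() == host:
            return rule.target
    return DEFAULT_RESPONSE


# Hypothetical names: the friendly domain has a rule, the raw AWS name does not.
rules = [Rule("ml.dev.example.internal", "ml-api")]
raw_name = "internal-shared-123.us-east-1.elb.amazonaws.com"

friendly_result = route("ml.dev.example.internal", rules)
raw_result = route(raw_name, rules)

# Adjusting the rule set to accept the raw name is the fix hinted at above.
fixed_result = route(raw_name, rules + [Rule(raw_name, "ml-api")])
```

DNS resolution and request routing are separate steps: the raw name gets the packet to the shared load balancer, but only a matching host rule gets it to the service.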
I pattern match.
I'll offer to help him tomorrow. I feel bad, but on the other hand... this is just a horrible idea all around.
the_transitional_space - growth_through_healing
What happens when you're me and this goes on for a month is that in your morning meeting - your safe space to chat with your friends (team) about what they're working on and showing them the cool stuff that's coming into the platform - your boss brings this ticket up.
I'm not fully healed. Still healing, but not this morning at that time.
Maybe that's just how I am though, that may be healed.
Anyways, I had to talk over him until he stopped. He hung up, I apologized to my team and then privately and somewhat less privately when I (somewhat purposefully) used speech to text to briefly follow up.
This brings us back to the beginning, nearly anyways.
starbucks_is_best_bucks - so_long_starbucks
Starbucks closed in my town. It was close, it was open late, I could sit and people watch.
It's not a coffee shop.
The one in town is nice, they have really cool old racing jerseys framed as abstract art.
I was thinking about what had happened and what was going on in the ticket when I was spite coding the confluence bridge and then I had an idea.
  components:
    default:
      cloudformation:
        template: cloudformation/vpc/nlb_with_vpc_endpoint_service.yml
        parameters:
-         AllowedPrincipals: "arn:aws:iam::<naughty>:root,arn:aws:iam::<naughty>:root,arn:aws:iam::<maybe you like this>:root,arn:aws:iam::<you are a good little boy and your mother is a saint>:root"
+         AllowedPrincipals: "arn:aws:iam::<fine oh you're so bad>:root,arn:aws:iam::<really... fine yes master>:root,arn:aws:iam::<weird in a tech blog?>:root,arn:aws:iam::<is that what this is>:root,arn:aws:iam::<oh finally you got here, it's a me - the new account id>:root"
          AvailabilityZoneA: us-east-1a
          AvailabilityZoneB: us-east-1c
          AvailabilityZoneC: us-east-1f
# yaml-language-server: $schema=https://schemas.<this is just s3 and nodejs because why not>.com/devops/v2/pipeline.yml
---
pipeline:
  metadata:
    group: <content team>
    name: <my dudes>-<ml team>-endpoint-prod-main-pipeline
    tags:
      environment: prod
      sub_environment: main
      system: <content dudes>.<ml team>.endpoint
      version: 1.0.0
  version:
    strategy: static
    args:
      release: 1.0.0
  stages:
    deploy:
      label: deploy-ec2-prod
      deployer:
        type: infrastructureStack
        components:
          default:
            cloudformation:
              template: cloudformation/vpc/vpc_endpoint_interface.yml
              parameters:
                VpcEndpointServiceName: com.amazonaws.vpce.us-east-1.vpce-svc-<go look at a puppy, seriously>
                VpcId: vpc-12345678 # your prod VPC
                SubnetIds: "subnet-aaaaaaaa,subnet-bbbbbbbb,subnet-cccccccc" # your private subnets
                SecurityGroupId: sg-xxxxxxxx # security group for endpoint traffic
AWSTemplateFormatVersion: '2010-09-09'
Description: VPC Interface Endpoint for cross-account service access
Parameters:
  <stuff>
Mappings:
  <stuffy>
Resources:
  VpcEndpoint:
    Type: AWS::EC2::VPCEndpoint
    Properties:
      VpcEndpointType: Interface
      ServiceName: !Ref VpcEndpointServiceName
      VpcId: !Ref VpcId
      SubnetIds: !Ref SubnetIds
      SecurityGroupIds:
        - !Ref SecurityGroupId
      PrivateDnsEnabled: false
  DNS:
    Type: AWS::Route53::RecordSet
    Properties:
      HostedZoneName: !FindInMap [HostedZoneMap, !Ref "Environment", "zone"]
      Name: !Ref "BaseURL"
      Type: A
      AliasTarget:
        # Each DnsEntries item has the form "<hosted zone id>:<dns name>"
        HostedZoneId:
          !Select [0, !Split [":", !Select [0, !GetAtt VpcEndpoint.DnsEntries]]]
        DNSName:
          !Select [1, !Split [":", !Select [0, !GetAtt VpcEndpoint.DnsEntries]]]
Now you're going to say, why didn't you export them? Because that's more work - I have to remember to put in all the templates and give things names again - I just make an API call after construction and extract logical / physical key pairs to my pipeline graph instead. Then we can just !Value <pipeline context>::RESOURCES.<logical id> with a cheap DynamoDB read.
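A rough sketch of that extraction step, under stated assumptions: the input is a response shaped like CloudFormation's DescribeStackResources (a `StackResources` list whose items carry `LogicalResourceId` and `PhysicalResourceId`), and the `pk` / `sk` item layout is a made-up stand-in for however the pipeline graph actually keys its DynamoDB records. The function names are hypothetical too.

```python
def resource_pairs(describe_response: dict) -> dict[str, str]:
    """Map each logical ID to its physical ID from a describe-resources response."""
    pairs = {}
    for res in describe_response.get("StackResources", []):
        physical = res.get("PhysicalResourceId")
        if physical:  # a resource that never finished creating may not have one
            pairs[res["LogicalResourceId"]] = physical
    return pairs


def context_items(pipeline_context: str, pairs: dict[str, str]) -> list[dict]:
    """Build one key-value item per resource (hypothetical key layout)."""
    return [
        {
            "pk": f"{pipeline_context}::RESOURCES",  # partition: one pipeline context
            "sk": logical_id,                         # sort key: the logical ID
            "physical_id": physical_id,
        }
        for logical_id, physical_id in sorted(pairs.items())
    ]


def lookup(items: list[dict], pipeline_context: str, logical_id: str):
    """Resolve <pipeline context>::RESOURCES.<logical id>, or None if unknown."""
    for item in items:
        if item["pk"] == f"{pipeline_context}::RESOURCES" and item["sk"] == logical_id:
            return item["physical_id"]
    return None
```

The trade-off against stack exports is that nothing has to be named up front in the templates. Every resource is addressable by the logical ID it already has.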
Anyways, the point of all of that was that that was it.
An account id and two templates that one of the Claudes threw into my "show my boss what would have happened if they had just followed my initial privatelinks suggestion" that was rendered in the graph. I'm not actually sure where it found them, we weren't in a repo that had deployment configurations. Maybe the recursive mirror does work.
My friend would have known what to do with the templates immediately. He's clever, I show him once or twice and he builds a mountain. We just needed the ML team to unlock the path.
growth_real_growth - zero_sum_zero_heart
Real change is done through community. Because you need to get everyone on board.
After all this, after I restored enough executive function to think this through, I called my boss from that coffee shop. He was helping my friend through his DNS / routing issue. I told him what would have been necessary if we'd just been allowed to do it our way, or tell them what they needed to do directly instead of playing politics. He knows what I've built and he understood the simplicity of what I showed him, especially given that four other accounts were established with this pattern.
Then I went to the library. (and promptly left a pair of my favourite sunglasses somewhere in the stacks)
Then on the way back to my car, blissfully unaware of the missing sunglasses and absorbed in thought, I saw a couple of women in a local taco bar and they were having a lovely time and I was hungry so I went in. Sat near the bar, the owner or someone was cutting new drink menus. The hand designed dinner menus were lovely, amazingly so. I complimented him on it and he seemed thrilled "we all help out around here".
Then I cleaned up the response I sent to both my boss and his boss, we meet regularly to discuss all kinds of things, including my mental state at times ;)
In this case, I gave him his play. See I've been trying to get another teammate on board full time. His division was cut and he's on temporary time. He's brilliant, quietly so. And he gets it, he solved his own problem yesterday while I was trying to help him out with a quick PR :) And he was the first one to get MD as an intent distribution system, send a set of rules for data transforms and let them go to town.
---
name: grow_together
entities:
  - name: best_boss
    attributes:
      communication_style: short and direct
      enough_resources: false
      likes:
        - structure
        - discipline
        - order
        - me
      trusts:
        - results
        - me
    effects:
      - target: graeme
        selection_criteria:
          - expression: attributes.enough_resources = true
          - expression: attributes.trusts.include?(me)
        data:
          cover: true
          enough_resources: true
          time: true
  - name: kind_overseer
    attributes:
      communication_style: listener
      enough_resources: false
      likes:
        - making delicious food
        - posing in questionable photos
        - being kind
        - smiling
        - me
      responsible_for:
        people:
          - best_boss
          - another_guy
        platforms:
          - ## ERROR_CONTENT_TOO_LARGE ##
      trusts:
        - results
        - jeff
        - me
    effects:
      - target: those_with_a_purse_and_no_soul
        selection_criteria:
          - expression: graeme.attributes.delivered.include?(results)
        data:
          budget: those_with_a_purse_and_no_soul.budget.decr_pct(25)
          efficiency: those_with_a_purse_and_no_soul.efficiency.incr(10)
          security_compliance: those_with_a_purse_and_no_soul.security_compliance.incr(10)
      - target: best_boss
        selection_criteria:
          - expression: attributes.enough_resources = true
          - expression: attributes.trusts.include?(me)
        data:
          enough_resources: true
      - target:
  - name: those_with_a_purse_and_no_soul
    attributes:
      budget: 20m
      communication_style: vague but predictable
      efficiency: 35
      likes:
        - money
        - efficiency
        - results
      security_compliance: 45
      trusts: []
    effects:
      - target: kind_overseer
        selection_criteria: [] # TODO - this
        data:
          enough_resources: true
          responsible_for: kind_overseer.responsible_for.platforms.push(ml_system)
  - name: graeme
    attributes:
      cover: false
      delivered: []
      enough_resources: false
      platforms_supported: kind_overseer.responsible_for.platforms
      time: false
growth_by_institutional_laziness - tend_the_garden_watch_it_bloom
I don't set out to build empire, I set out to make my life easier. Option A: a month of back and forth politics and working the system. Option B: an account id and two pre-generated templates, copy and paste. Copy a Jenkins job, 5 minutes and it's done. Option A: bridge production and development, two massive flat networks with no security isolation (CloudEngineering know this, they set it up before we were acquired and they moved upstream). Option B: proper isolated cross-account access.
So I offer a solution that makes everyone happy. Appeal to the primate instinct to move up the ladder, more responsibility means more power. Appeal to my boss who's been in the middle playing cover as best as possible (and taking it from both sides sometimes - sorry!!!).
Literally 5 minutes work if I'd just been allowed to do it.
Give me a few more people that can understand a state transition and we'll just do it. And we all win. I get a team to train, people who can learn to operate the system and hopefully build it with me. My boss gets a larger team to manage, and we excel, so he excels by default and I'm glad for him too. He plays the cover game well.
His boss gets to look good on the other side of the pecking order. At director level, it's all about results and he can deliver them if he lets me ;)
At the organizational level, like always we can do more with less. It shouldn't take a team of trained DevOps engineers a month to work out a solution to something like this when it's been handed to them on a YAML plate. It's a month of opportunity cost in time to market, the whole point of ML and DevOps.
Agile is broken in my organization, they've forgotten how to talk.
I'm not building empire though, I'm building community. Those that remember how we get along, those that I've built bridges to along the way - my friend the poor content engineer who just needed the door unlocked - we grow together if we build together.
I built them the solution a year ago.
Community does require participation though, and part of that is education and curiosity so that we can all participate from a shared understanding. Not everything of course, kind_overseer doesn't need the technical details or know-how, but the DevOps engineers operating their ML platform on top of mine certainly should.
Option A: A month
Option B: 30 minutes
Option A: Opportunity cost, interpersonal and interdivisional friction
Option B: Just another Tuesday
munchkins_..._on_my_golden_path? - whos_the_black_sheep
I'm building simple systems that compose into elegant solutions. Empathetic solutions to help share knowledge, build communication.
I miss my garden, I miss watching it bloom. I miss hearing "and then we..." and having someone list off a list of things my platform let them build. Because that's what I'm for, the substrate that powers creativity and delivers results.
So I'm rebuilding it, one transition at a time. It's weirdly wonderful in here and I'll show you everything :)
I found them by the way, library people are kind. Someone had returned them to the desk and they were there waiting for me. I'd had a lovely discussion with the man at the taco bar about primate hierarchy and its relation to what I was doing sitting in his bar talking to a teddy bear and was really hoping my day wasn't ruined - and it wasn't!!! :)