Dear Loyal Readers,
I’ve been using BIND in AWS for a while now. Initially, Route 53 only created publicly available DNS entries, but AWS recently added a private, VPC-aware Route 53. This makes our security team happy! Sadly, the record propagation delay kept us from spinning up new instances quickly.
We might have been able to live with long instance creation times, but then we had the problem of being on a private network. Our project was not completely self-contained, and we needed to interact with other groups. Just as we needed to know their records, other groups would need to know ours. So we needed to push them up to the main DNS server.
I was able to train some people to manually edit zone files, but they kept forgetting to create the reverse lookup (PTR) records. So I scripted this for them:
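    #!/usr/bin/env ruby
    # Generate reverse-lookup (PTR) zone files from the A records in a forward zone file.
    require 'date'

    subdomain = "chowski.local"
    ls = IO.readlines(subdomain)   # the forward zone file is named after the subdomain

    # host [IN] A a.b.c.d  ->  host, subnet (a.b.c), forth (d)
    ARECORD_REX = /^(?<host>\S*)\s*(?:IN\s*)?A\s*(?<subnet>\d+\.\d+\.\d+)\.(?<forth>\d+)$/

    # Maps each reversed subnet (e.g. "136.103.10") to its [last octet, FQDN] pairs.
    revlook = {}
    # reject (not reject!) because reject! returns nil when nothing was removed
    ls.reject { |l| l =~ /^[#;].*/ }.each { |l|
      m = l.match(ARECORD_REX)
      if m
        key = m[:subnet].split(".").reverse.join(".")
        revlook[key] = [] if revlook[key].nil?
        revlook[key] << [ m[:forth], "#{m[:host]}.#{subdomain}" ]
      end
    }

    #el#136.103.10.in-addr.arpa. IN NS ns.aws.chowski.local.
    #el#37 IN PTR el2-gss-chf1.aws.chowski.local.

    puts "Make sure you have (/etc/named.conf):\n"
    revlook.keys.each { |revsubdomain|
      revfile = File.open("#{revsubdomain}.in-addr.arpa", "w")
      # The SOA serial is a 32-bit number, so stop at the hour (YYYYMMDDHH).
      id = DateTime.now.strftime("%Y%m%d%H")
      header = <<~MULTI
        @ IN SOA #{revsubdomain}.in-addr.arpa. el2-gss-dns1.#{subdomain}. (
            #{id}; Serial
            43200 ; Refresh
            3600 ; Retry
            3600000 ; Expire
            2592000 ) ; Minimum
        #{revsubdomain}.in-addr.arpa. IN NS el2-gss-dns1.#{subdomain}.
      MULTI
      revfile.puts header
      revlook[revsubdomain].each { |l| revfile.puts "#{l.first}\tIN\tPTR\t#{l.last}." }
      revfile.close
      # namedconf is never defined in this listing; the stanza below is an
      # assumed minimal form, so adjust it to your own named.conf.
      namedconf = <<~CONF
        zone "#{revsubdomain}.in-addr.arpa" IN {
          type master;
          file "#{revsubdomain}.in-addr.arpa";
        };
      CONF
      puts namedconf
    } #keys.each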
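To make the mapping concrete, here is a minimal sketch of what happens to a single A record. The sample record (`web1`, `10.103.136.37`) and the helper name `ptr_for` are made up for illustration; the regex is a simplified take on the one in the script.

```ruby
# Hypothetical helper: turn one zone-file A record into its reverse zone name
# and the PTR line that belongs in that zone.
A_RECORD = /^(?<host>\S+)\s+(?:IN\s+)?A\s+(?<subnet>\d+\.\d+\.\d+)\.(?<last>\d+)$/

def ptr_for(line, subdomain)
  m = line.match(A_RECORD)
  return nil unless m   # comments and non-A records are ignored

  # Reverse the first three octets to get the in-addr.arpa zone name.
  zone = m[:subnet].split(".").reverse.join(".") + ".in-addr.arpa"
  { zone: zone, record: "#{m[:last]}\tIN\tPTR\t#{m[:host]}.#{subdomain}." }
end

result = ptr_for("web1 IN A 10.103.136.37", "chowski.local")
puts result[:zone]
puts result[:record]
```

The record for 10.103.136.37 lands in the 136.103.10.in-addr.arpa zone, keyed by the last octet.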
Next, we needed dynamic DNS (1) (2), because the networking (including DNS) needs to be set up automatically before our Configuration Management tool can begin to install the middleware.
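For a rough idea of what a dynamic update involves, the sketch below builds an nsupdate command script that registers a new instance's A record and its matching PTR record. The server address, hostname, IP, and TTL are placeholders, and `registration_script` is a hypothetical helper, not something from our tooling.

```ruby
# Hypothetical helper: build the nsupdate commands that register one host.
# The A and PTR records live in different zones, so each gets its own "send".
def registration_script(server:, fqdn:, ip:, ttl: 300)
  ptr_name = ip.split(".").reverse.join(".") + ".in-addr.arpa"
  [
    "server #{server}",
    "update add #{fqdn}. #{ttl} A #{ip}",
    "send",
    "update add #{ptr_name}. #{ttl} PTR #{fqdn}.",
    "send"
  ].join("\n") + "\n"
end

puts registration_script(server: "10.0.0.53",
                         fqdn: "app1.aws.chowski.local",
                         ip: "10.103.136.37")
```

The output can be saved to a file or piped into `nsupdate -k <keyfile>`, which reads its commands from standard input when no file is given.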
RHEL Tip: named has to be able to write to its zone directory for dynamic updates to work:
    sudo chown named:named /var/named/
    sudo chmod 770 /var/named
Then last week happened.
The A records from AWS’s databases (RDS) and load balancers (ELB) have a low expiration (60 seconds). I guess AWS needs its SaaS offerings to be more dynamic as they reserve the right to rebuild the front end if they detect a problem. Also, ELBs will scale up if need be (and so cover more IPs). This constant cache expiration exacerbated a problem between my local bind server and the upstream private DNS cluster. Just like in testing, I want to reduce my dependence on any external factors. I have no control over the upstream system, and I always want to be in control.
My solution: Cache the DNS hits myself. In the following script, I’m going to make A records in my sub-domain for every ELB/RDS instance in AWS (instead of CNAMEs).
- If I get a lookup miss, or the IP hasn’t changed, then I can just not update it, and bind continues to serve up the last known IP.
- If I get a difference, I’ll build the nsupdate script, and execute it against the local bind server (using the private key for authentication).
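The two rules above boil down to a small decision plus a handful of nsupdate lines. Here is a minimal sketch of just that part, kept free of any network calls so it is easy to test. The function names `plan_update` and `nsupdate_lines` are hypothetical, and the sample IPs are made up; the TTL of 150 matches the script.

```ruby
# Decide what to do for one name, given the IPs the local server currently
# serves (old_ips) and the IPs the upstream resolver just returned (new_ips).
def plan_update(old_ips, new_ips, force: false)
  return :skip_no_response if new_ips.empty?   # lookup miss: keep the last known IPs
  return :update if force                      # forced runs skip the comparison
  old_ips.sort == new_ips.sort ? :skip_unchanged : :update
end

# Build the nsupdate commands that replace a name's records with fresh A records.
def nsupdate_lines(host, subdomain, ips, ttl: 150)
  fqdn = "#{host}.#{subdomain}"
  ["update delete #{fqdn} a", "update delete #{fqdn} cname"] +
    ips.map { |ip| "update add #{fqdn} #{ttl} a #{ip}" }
end

old_ips = ["10.0.0.1"]
new_ips = ["10.0.0.2", "10.0.0.3"]
if plan_update(old_ips, new_ips) == :update
  puts nsupdate_lines("myexampledb", "aws.chowski.local", new_ips)
end
```

Checking for an empty response before anything else matters: deleting the old records and adding nothing would leave the name unresolvable.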
    #!/usr/bin/env ruby
    # Mirror the A records of AWS endpoints (RDS/ELB) into the local bind zone via nsupdate.
    require 'resolv'
    require 'socket'

    SUBDOMAIN = "aws.chowski.local"
    aws_alist = %w{
      myexampledb.a1b2c3d4wxyz.us-west-2.rds.amazonaws.com
      myexampleelb.a1b2c3d4wxyz.us-east-1.rds.amazonaws.com
    }

    # The log and nsupdate.send are written next to the script.
    script = File.basename($0)
    Dir.chdir( File.dirname $0 )

    if ARGV.include?("debug")
      jdebug = true
      outstream = STDOUT
      outstream.puts "STDOUT used in debug"
      ARGV.delete("debug")
    else
      jdebug = false
      outstream = File.open("#{script}.log", "w")
    end
    outstream.sync = true

    if ARGV.include?("force")
      jforce = true
      outstream.puts "Force update enabled, skipping cache check"
      ARGV.delete("force")
    else
      jforce = false
    end

    unless ARGV.count == 1
      outstream.puts "Usage: #{$0} uploader.key"
      exit 1
    end
    keypath = ARGV[0]
    raise "keyfile(#{keypath}) does not exist" unless File.exist?(keypath)

    # First non-loopback IPv4 address of this host, where the local bind server listens.
    myip = Socket.ip_address_list.select { |ai| ai.ipv4? }.map { |ai| ai.ip_address }.reject { |s| s == "127.0.0.1" }.first

    outstream.puts "Start: #{Time.new}"
    outstream.puts "Using #{myip}"

    # r_resolver asks the upstream DNS cluster; l_resolver asks the local bind server.
    r_resolver = Resolv::DNS.new( :nameserver => [ '10.79.154.80', '10.239.138.80' ], :search => [SUBDOMAIN], :ndots => 1 )
    l_resolver = Resolv::DNS.new( :nameserver => [ "#{myip}" ], :search => [SUBDOMAIN], :ndots => 1 )

    h = {}
    aws_alist.each { |arecord|
      host = arecord.split(".")[0]
      outstream.print "Resolving #{host}: "
      new_ips = r_resolver.getaddresses(arecord).map { |cls| cls.to_s }.sort
      m = host.match(/internal-(.*)-\d+/)
      host = m[1] if m   # rewrite for internal elb shortname
      old_ips = l_resolver.getaddresses(host).map { |cls| cls.to_s }.sort
      outstream.print "-- OLD(#{old_ips.join(",")}) & NEW(#{new_ips.join(",")}) -- " if jdebug

      # The empty check comes first so that a forced run never deletes a
      # record without having a replacement for it.
      if new_ips.empty?
        outstream.puts "No response, skipping"
      elsif jforce
        outstream.puts "forced: #{new_ips.join(", ")}"
        h[ arecord ] = { :host => host, :ip => new_ips }
      elsif old_ips == new_ips
        outstream.puts "Same response, skipping"
      else
        outstream.puts new_ips.join(", ")
        h[ arecord ] = { :host => host, :ip => new_ips }
      end #empty?
    }

    ## zone aws.chowski.local
    # nsupdate -k Klocalhost.+157+61824.private -v
    nsupdate_s = "server #{myip}\n"
    h.each { |arecord, arecord_val|
      # Drop any existing A or CNAME for the name, then add one A record per IP.
      nsupdate_s += "update delete #{ arecord_val[:host] }.#{SUBDOMAIN} a\n"
      nsupdate_s += "update delete #{ arecord_val[:host] }.#{SUBDOMAIN} cname\n"
      arecord_val[:ip].each { |ip|
        nsupdate_s += "update add #{ arecord_val[:host] }.#{SUBDOMAIN} 150 a #{ip}\n"
      }
    }
    nsupdate_s += "send\n"

    outstream.puts "Writing update script to ./nsupdate.send"
    IO.write("nsupdate.send", nsupdate_s)
    cmd = "nsupdate -k #{keypath} -v nsupdate.send"
    outstream.puts "executing: #{cmd}"
    outstream.puts `#{cmd}`
    outstream.puts "Stop: #{Time.new}"
    outstream.close
Until the Robots reduce us to oil for fuel (like the dinosaurs),
Jonathan Malachowski